REMARKS

Claims 1-2, 7, 9, 12-15, 17-18, 20-21, and 23, drawn to non-elected subject matter, were 
withdrawn from consideration by the Examiner. 
Claims 16, 19, and 22 are canceled. 
Claims 3-6, 8, and 10-11 are under consideration. 

Claim 3, as currently amended, is in independent form and now contains all of the 
limitations of original claim 1. Support for this amendment can be found in original claim 1. 

Claim 8, as currently amended, depends from claim 3, which is presently amended to 
contain the limitations of original claim 1. Support for this amendment can be found in original 
claim 1. 

Claims 3, 8, and 10, as currently amended, recite SEQ ID NO:5 or SEQ ID NO: 19. 
Support for these amendments can be found in the corresponding original claims. 

Claims 3 and 8 have been amended to recite "wherein said biologically active fragment 
has sphingosine kinase activity" to further clarify the intended subject matter of the claimed 
invention. Support for these amendments can be found, for example, in the specification at p. 59. 

Claim 3 has been amended to recite "comprising at least 150 contiguous amino acids" 
and claim 11 has been amended to recite "500 contiguous nucleotides." Support for these 
amendments can be found, for example, in the specification at p. 13, line 40 through p. 14, line 4. 

No new matter has been added by any of these amendments. Entry of these amendments 
is therefore respectfully requested. 

Applicants reserve the right to prosecute non-elected subject matter in subsequent 
divisional applications. 

Restriction Requirement 

Applicants respectfully reiterate their traversal of the restriction requirement for at least the 
reasons already made of record. 

Moreover, Applicants note that, as currently amended, all of the claims currently under 
consideration avoid the prior art cited by the Examiner as destroying unity of invention. They 
reiterate their request that the Examiner examine at least claims directed to the polypeptide of 
SEQ ID NO:5 and the polynucleotide sequence of SEQ ID NO:19 in this single application. 

Objections 

Claims 3-6 and 8 are objected to as depending from a non-elected claim (claim 1). Claim 
3, as currently amended, is in independent form and now contains all of the limitations of 
original claim 1. Claim 8 now depends from claim 3. Claims 4-6 no longer depend from a non- 
elected claim. Applicants respectfully request that these amendments be entered and that this 
objection be withdrawn. 

Claims 3-6, 8, and 10 are objected to as containing non-elected subject matter (i.e., SEQ 
ID NO:1-4, SEQ ID NO:6-18, and SEQ ID NO:20-28). The claims, as currently amended, recite 
the polypeptide of SEQ ID NO:5 and fragments and variants thereof, and the encoding 
polynucleotide of SEQ ID NO: 19 and fragments and variants thereof. Applicants respectfully 
request that these amendments be entered and that this objection be withdrawn. 

Rejection under 35 U.S.C. § 112, second paragraph 

Claims 3-6 and 8 are rejected under 35 U.S.C. § 112, second paragraph, as allegedly being indefinite due 
to the recitation of "biologically active" in original claim 1. Claims 3 and 8, as currently 
amended, recite the limitations of original claim 1. These claims have been further amended to 
recite "wherein said biologically active fragment has sphingosine kinase activity" in order to 
further clarify the meaning of the term "biologically active." One of skill in the art would clearly 
understand that the "biological activity" to which the claims as amended refer is sphingosine 
kinase activity, as this specific activity is now recited explicitly. These amendments are fully 
supported by the disclosure in the present application, and are put forth merely to further clarify 
the claims and to obtain expeditious allowance of the instant application. Applicants expressly 
do not disclaim equivalents of the invention which could include polypeptides having additional 
biological activities other than sphingosine kinase inducing activity. Therefore, Applicants 
respectfully request that the rejection under 35 U.S.C. § 112, second paragraph be withdrawn. 

Rejection under 35 U.S.C. § 101 and 35 U.S.C. § 112, first paragraph 

Claims 3-6, 8, and 10-11 stand rejected under 35 U.S.C. §§ 101 and 112, first paragraph, 
based on the allegation that the claimed invention lacks patentable utility. The rejection alleges 
in particular that "the specification fails to assert what compounds the protein of SEQ ID NO:5 
phosphorylates" and that therefore "the skilled artisan would require further research to identify 
or reasonable confirm a real world context of use." Applicants traverse this rejection for at least 
the following reasons. 

In response to this issue, Applicants direct the Examiner's attention to those points in the 
specification that detail the specificity of the claimed polypeptide. In particular, Applicants 
direct the Examiner's attention to the specification at p. 23, lines 11-18. These lines describe the 
methods with which the claimed polypeptides are characterized. In particular, "column 5 [of 
Table 2] shows the amino acid residues comprising signature sequences and motifs; column 6 
shows homologous sequences as identified by BLAST analysis" and "[t]he methods of column 7 
were used to characterize each polypeptide through sequence homology and protein motifs." 

Turning the Examiner's attention to Table 2 (p. 59, last row), mouse sphingosine kinase 
is cited as a homologous sequence. As the Examiner has recognized, SEQ ID NO:5 is over 80% 
identical to mouse sphingosine kinase. The Examiner has, however, seemingly disregarded this 
recitation of a homolog and focused on the identification of a diacylglycerol kinase catalytic 
domain, alleging on that basis that SEQ ID NO:5 is a diacylglycerol kinase while admitting that 
no such assertion was made by Applicants. 

As stated below (Section II.C.), the presence of this domain supports the characterization 
of SEQ ID NO:5 as a sphingosine kinase and indeed, as it was known in the art at the time of 
filing of the instant application, diacylglycerol kinases and sphingosine kinases share regions of 
significant homology (Kohama et al., JBC 273:23722-8, 1998). Therefore, the identification of a 
diacylglycerol kinase catalytic domain is consistent with the identification of mouse sphingosine 
kinase as a homologous sequence. Thus, there is sufficient basis in the specification for 
identifying SEQ ID NO:5 as a sphingosine kinase. Indeed, upon reading the specification and 
attached tables, the skilled artisan would have had no reason to doubt the characterization of 
SEQ ID NO:5 as a human homolog of the disclosed mouse sphingosine kinase. 

Furthermore, an alignment of SEQ ID NO: 5 with a post-filing human sphingosine kinase 
(Nava et al., FEBS Letters 473:81-4 (2000); Reference No. 1) shows that the two sequences are 
approximately 99% identical over the entire 384 amino acid residue length of both sequences. 
Thus Applicants' assertion that SEQ ID NO:5 is a sphingosine kinase is corroborated by post- 
filing experimental data. 

In sum, no additional research would be required of the skilled artisan to find a real world 
context of use for the claimed invention, as its use as a sphingosine kinase is sufficiently asserted 
in the specification and further supported by additional data in the literature. 

As a preliminary matter Applicants respond to an issue the Examiner raised as part of the 
rejection under 35 U.S.C. § 101. The Examiner alleges that although Applicants assert that the 
claimed polynucleotides are useful for the diagnosis, treatment or prevention of neurological, cell 
proliferative and autoimmune/inflammatory disorders, there is no link of SEQ ID NO: 19 to a 
specific disease state (see Office Action at pp. 7-8). Applicants respectfully disagree and, by way 
of example, direct the Examiner's attention to p. 57 of the specification (Table 1, row 6). 
Column 5 shows a list of specific cDNA libraries in which fragments of SEQ ID NO: 19 were 
expressed. For example, fragment 1519153H1 was expressed in a cDNA library (BLADTUT04) 
derived from bladder tumor tissue (for a description of this library, see Table 4, p. 65, row 2). 
These data support the assertion that SEQ ID NO: 19 may be useful in the diagnosis, treatment or 
prevention of this cell proliferative disorder by showing that SEQ ID NO: 19 is expressed in at 
least one cell proliferative disorder, namely bladder cancer. 

The rejection of claims 3-6, 8, and 10-11 is improper, as the inventions of those 
claims have a patentable utility as set forth in the instant specification, and/or a utility well 
known to one of ordinary skill in the art. 

The invention at issue is a polynucleotide corresponding to a gene that is expressed in 
humans. The novel polynucleotide codes for a polypeptide demonstrated in the patent 
specification to be a member of the class of kinases, whose biological functions include 
phosphorylation of proteins. The claimed invention has numerous practical, beneficial uses in 
toxicology testing, drug development, and the diagnosis of disease, none of which requires 
knowledge of how the polypeptide coded for by the polynucleotide actually functions. 

Applicants submit with this brief the First Declaration of Bedilion describing some of the 
practical uses of the claimed invention in gene and protein expression monitoring applications. 
The First Bedilion Declaration demonstrates that the positions and arguments made by the Patent 
Examiner with respect to the utility of the claimed polynucleotide are without merit. 

The First Bedilion Declaration describes, in particular, how the claimed expressed 
polynucleotide can be used in gene expression monitoring applications that were well-known at 
the time the patent application was filed, and how those applications are useful in developing 
drugs and monitoring their activity. Dr. Bedilion states that the claimed invention is a useful tool 
when employed as a highly specific probe in a cDNA microarray: 

Persons skilled in the art would have appreciated on March 18, 1999 that cDNA 
microarrays that contained the SEQ ID NO:5-encoding polynucleotides would be a more 
useful tool than cDNA microarrays that did not contain the polynucleotides in connection 
with conducting gene expression monitoring studies on proposed (or actual) drugs for 
treating neurological, cell proliferative, and autoimmune/inflammatory disorders for such 
purposes as evaluating their efficacy and toxicity. (First Bedilion Declaration, ¶ 15.) 

Applicants further submit three additional expert Declarations under 37 C.F.R. § 1.132, 
with respective attachments, and ten (10) scientific references filed before the March 18, 1999 
priority date of the instant application. The First Bedilion Declaration, Rockett Declaration, Iyer 
Declaration, Second Bedilion Declaration, and the ten (10) references fully establish that, prior to 
the March 18, 1999 filing date of the parent application (Ser. No. 60/125,593, hereinafter the 
"Bandman '593 application"), it was well-established in the art that: 

polynucleotides derived from nucleic acids expressed in one or 
more tissues and/or cell types can be used as hybridization probes (that is, as 
tools) to survey for and to measure the presence, the absence, and the amount 
of expression of their cognate gene; 

with sufficient length, at sufficient hybridization stringency, and 
with sufficient wash stringency (conditions that can be routinely established), 
expressed polynucleotides, used as probes, generate a signal that is specific to 
the cognate gene, that is, produce a gene-specific expression signal; 

expression analysis is useful, inter alia, in drug discovery and 
lead optimization efforts, in toxicology, particularly toxicology studies 
conducted early in drug development efforts, and in phenotypic 
characterization and categorization of cell types, including neoplastic cell 
types; 

each additional gene-specific probe used as a tool in expression 
analysis provides an additional gene-specific signal that could not otherwise 
have been detected, giving a more comprehensive, robust, higher resolution, 
statistically more significant, and thus more useful expression pattern in such 
analyses than would otherwise have been possible; 

biologists, such as toxicologists, recognize the increased utility 
of more comprehensive, robust, higher resolution, statistically more significant 
results, and thus want each newly identified expressed gene to be included in 
such an analysis; 

nucleic acid microarrays increase the parallelism of expression 
measurements, providing expression data analogous to that provided by older, 
lower throughput techniques, but at substantially increased throughput; 

accordingly, when expression profiling is performed using 
microarrays, each additional gene-specific probe that is included as a signaling 
component on this analytical device increases the detection range, and thus 
versatility, of this research tool; 

biologists, such as toxicologists, recognize the increased utility 
of such improved tools, and thus want a gene-specific probe to each newly 
identified expressed gene to be included in such an analytical device; 

the industrial suppliers of microarrays recognize the increased 
utility of such improved tools to their customers, and thus strive to improve 
salability of their microarrays by adding each newly identified expressed gene 
to the microarrays they sell; 

it is not necessary that the biological function of a gene be 
known for measurement of its expression to be useful in drug discovery and 
lead optimization analyses, toxicology, or molecular phenotyping experiments; 

failure of a probe to detect changes in expression of its cognate 
gene does not diminish the usefulness of the probe as a research tool; and 

failure of a probe completely to detect its cognate transcript in 
any single expression analysis experiment does not deprive the probe of 
usefulness to the community of users who would use it as a research tool. 

The Patent Examiner does not dispute that the claimed polynucleotide can be used as a 
probe in cDNA microarrays and used in gene expression monitoring applications. Instead, the 
Patent Examiner contends that the claimed polynucleotide cannot be useful without precise 
knowledge of its biological function, or the biological function of the polypeptide it encodes. 
But the law has never required knowledge of biological function to prove utility. It is the 
claimed invention's uses, not its functions, that are the subject of a proper analysis under the 
utility requirement. 

In any event, as demonstrated by the First Bedilion Declaration, the Rockett Declaration, 
the Iyer Declaration, and the Second Bedilion Declaration, the person of ordinary skill in the art 
can achieve beneficial results from the claimed polynucleotide in the absence of any knowledge 
as to the precise function of the protein encoded by it. The uses of the claimed polynucleotide in 
gene expression monitoring applications are in fact independent of its precise biological function. 

I. The applicable legal standard 

To meet the utility requirement of sections 101 and 112 of the Patent Act, the patent 
applicant need only show that the claimed invention is "practically useful," Anderson v. Natta, 
480 F.2d 1392, 1397, 178 USPQ 458 (CCPA 1973) and confers a "specific benefit" on the 
public. Brenner v. Manson, 383 U.S. 519, 534-35, 148 USPQ 689 (1966). As discussed in a 
recent Court of Appeals for the Federal Circuit case, this threshold is not high: 

An invention is "useful" under section 101 if it is capable of providing some identifiable 
benefit. See Brenner v. Manson, 383 U.S. 519, 534 [148 USPQ 689] (1966); Brooktree 
Corp. v. Advanced Micro Devices, Inc., 977 F.2d 1555, 1571 [24 USPQ2d 1401] (Fed. 
Cir. 1992) ("to violate Section 101 the claimed device must be totally incapable of 
achieving a useful result"); Fuller v. Berger, 120 F. 274, 275 (7th Cir. 1903) (test for 
utility is whether invention "is incapable of serving any beneficial end"). 

Juicy Whip Inc. v. Orange Bang Inc., 51 USPQ2d 1700 (Fed. Cir. 1999). 

While an asserted utility must be described with specificity, the patent applicant need not 
demonstrate utility to a certainty. In Stiftung v. Renishaw PLC, 945 F.2d 1173, 1180, 
20 USPQ2d 1094 (Fed. Cir. 1991), the United States Court of Appeals for the Federal Circuit 
explained: 

An invention need not be the best or only way to accomplish a certain result, and it need 
only be useful to some extent and in certain applications: "[T]he fact that an invention has 
only limited utility and is only operable in certain applications is not grounds for finding 
lack of utility." Envirotech Corp. v. Al George, Inc., 730 F.2d 753, 762, 221 USPQ 473, 
480 (Fed. Cir. 1984). 

The specificity requirement is not, therefore, an onerous one. If the asserted utility is 
described so that a person of ordinary skill in the art would understand how to use the claimed 
invention, it is sufficiently specific. See Standard Oil Co. v. Montedison, S.p.a., 212 U.S.P.Q. 
327, 343 (3d Cir. 1981). The specificity requirement is met unless the asserted utility amounts to 
a "nebulous expression" such as "biological activity" or "biological properties" that does not 
convey meaningful information about the utility of what is being claimed. Cross v. Iizuka, 
753 F.2d 1040, 1048 (Fed. Cir. 1985). 

In addition to conferring a specific benefit on the public, the benefit must also be 
"substantial." Brenner, 383 U.S. at 534. A "substantial" utility is a practical, "real-world" utility. 
Nelson v. Bowler, 626 F.2d 853, 856, 206 USPQ 881 (CCPA 1980). 

If persons of ordinary skill in the art would understand that there is a "well-established" 
utility for the claimed invention, the threshold is met automatically and the applicant need not 
make any showing to demonstrate utility. Manual of Patent Examining Procedure at § 706.03(a). 
Only if there is no "well-established" utility for the claimed invention must the applicant 
demonstrate the practical benefits of the invention. Id. 

Once the patent applicant identifies a specific utility, the claimed invention is presumed 
to possess it. In re Cortright, 165 F.3d 1353, 1357, 49 USPQ2d 1464 (Fed. Cir. 1999); In re 
Brana, 51 F.3d 1560, 1566, 34 USPQ2d 1436 (Fed. Cir. 1995). In that case, the Patent Office 
bears the burden of demonstrating that a person of ordinary skill in the art would reasonably 
doubt that the asserted utility could be achieved by the claimed invention. Id. To do so, the 
Patent Office must provide evidence or sound scientific reasoning. See In re Langer, 503 F.2d 
1380, 1391-92, 183 USPQ 288 (CCPA 1974). If and only if the Patent Office makes such a 
showing, the burden shifts to the applicant to provide rebuttal evidence that would convince the 
person of ordinary skill that there is sufficient proof of utility. Brana, 51 F.3d at 1566. The 
applicant need only prove a "substantial likelihood" of utility; certainty is not required. Brenner, 
383 U.S. at 532. 

II. The uses of polynucleotides encoding HRIP for diagnosis of conditions or diseases 
characterized by expression of HRIP and for drug discovery are sufficient utilities 
under 35 U.S.C. §§ 101 and 112, first paragraph 

The claimed invention meets all of the necessary requirements for establishing a credible 
utility under the Patent Law: There are "well-established" uses for the claimed invention known 
to persons of ordinary skill in the art, and there are specific practical and beneficial uses for the 
invention disclosed in the patent application's specification. These uses are explained, in detail, 
in the Bedilion Declaration accompanying this brief. Objective evidence, not considered by the 
Patent Office, further corroborates the credibility of the asserted utilities. 

A. The use of the claimed HRIP encoding polynucleotides for toxicology testing, 
drug discovery, and disease diagnosis are practical uses that confer "specific 
benefits" to the public 

The claimed invention has specific, substantial, real-world utility by virtue of its use in 
toxicology testing, drug development and disease diagnosis through gene expression profiling. 
These uses are explained in detail in the accompanying First Bedilion Declaration, Rockett 
Declaration, Iyer Declaration, and Second Bedilion Declaration, the substance of which is not 
rebutted by the Patent Examiner. There is no dispute that the claimed invention is in fact a useful 
tool in cDNA microarrays used to perform gene expression analysis. That is sufficient to 
establish utility for the claimed polynucleotide. 

The instant application is a U.S. National Stage of International Application No. 
PCT/US00/07277 and claims priority to a provisional application, Bandman et al., Ser. No. 
60/125,593, filed on March 18, 1999, (hereinafter "the Bandman '593 application"). 

In his first Declaration, Dr. Bedilion explains the many reasons why a person skilled in 
the art reading the Bandman '593 application on March 18, 1999 would have understood that 
application to disclose the claimed polynucleotide to be useful for a number of gene expression 
monitoring applications, e.g., as a highly specific probe for the expression of that specific 
polynucleotide in connection with the development of drugs and the monitoring of the activity of 
such drugs (Bedilion Declaration at, e.g., ¶¶ 10-15). Much, but not all, of Dr. Bedilion's 
explanation concerns the use of the claimed polynucleotide in cDNA microarrays of the type first 
developed at Stanford University for evaluating the efficacy and toxicity of drugs, as well as for 
other applications (First Bedilion Declaration at, e.g., ¶¶ 12 and 15). 1 



1 Dr. Bedilion also explained, for example, why persons skilled in the art would also 
appreciate, based on the Bandman '593 specification, that the claimed polynucleotide would be 
useful in connection with developing new drugs using technology, such as Northern analysis, that 
predated by many years the development of the cDNA technology (First Bedilion Declaration, ¶ 

In connection with his explanations, Dr. Bedilion states that the "Bandman '593 
application would have led a person skilled in the art in March 1999 who was using gene 
expression monitoring in connection with working on developing new drugs for the treatment of 
neurological, cell proliferative, and autoimmune/inflammatory disorders [a] to conclude that a 
cDNA microarray that contained the SEQ ID NO:5-encoding polynucleotides would be a highly 
useful tool, and [b] to request specifically that any cDNA microarray that was being used for 
such purposes contain the SEQ ID NO:5-encoding polynucleotides" (Bedilion Declaration, ¶ 15). 
For example, as explained by Dr. Bedilion, "[p]ersons skilled in the art would [have appreciated 
on March 18, 1999] that a cDNA microarray that contained the SEQ ID NO:5-encoding 
polynucleotides would be a more useful tool than a cDNA microarray that did not contain the 
polynucleotides in connection with conducting gene expression monitoring studies on proposed 
(or actual) drugs for treating neurological, cell proliferative, and autoimmune/inflammatory 
disorders for such purposes as evaluating their efficacy and toxicity." Id. 

In support of those statements, Dr. Bedilion provided detailed explanations of how cDNA 
technology can be used to conduct gene expression monitoring evaluations, with extensive 
citations to pre-March 18, 1999 publications showing the state of the art on March 18, 1999 
(First Bedilion Declaration, ¶¶ 10-14). While Dr. Bedilion's explanations in paragraph 15 of his 
Declaration include almost three pages of text and six subparts (a)-(f), he specifically states that 
his explanations are not "all-inclusive." Id. For example, with respect to toxicity evaluations, 
Dr. Bedilion had earlier explained how persons skilled in the art who were working on drug 
development on March 18, 1999 (and for several years prior to March 18, 1999) "without any 
doubt" appreciated that the toxicity (or lack of toxicity) of any proposed drug was "one of the 
most important criteria to be considered and evaluated in connection with the development of the 
drug" and how the teachings of the Bandman '593 application clearly include using differential 
gene expression analyses in toxicity studies (First Bedilion Declaration, ¶ 10). 

Thus, the First Bedilion Declaration establishes that persons skilled in the art reading the 
Bandman '593 application at the time it was filed "would have wanted their cDNA microarray 
to have a [SEQ ID NO:5-encoding polynucleotide] probe because a microarray that contained 
such a probe (as compared to one that did not) would provide more useful results in the kind of 
gene expression monitoring studies using cDNA microarrays that persons skilled in the art have 
been doing since well prior to March 18, 1999" (First Bedilion Declaration, ¶ 15, item (f)). This, 
by itself, provides more than sufficient reason to compel the conclusion that the Bandman '593 
application disclosed to persons skilled in the art at the time of its filing substantial, specific and 
credible real-world utilities for the claimed polynucleotide. 

In his Declaration, Dr. Rockett explains the many reasons why a person skilled in the art 
in 1997 would have understood that any expressed polynucleotide is useful for a number of gene 
expression monitoring applications, e.g., in cDNA microarrays, in connection with the 
development of drugs and the monitoring of the activity of such drugs. (Rockett Declaration at, 
e.g., ¶¶ 10-18). 

It is my opinion, therefore, based on the state of the art in toxicology at least since 
the mid-1990s . . . that disclosure of the sequence of a new gene or protein, with or 
without knowledge of its biological function, would have been sufficient information 
for a toxicologist to use the gene and/or protein in expression profiling studies in 
toxicology. 2 [Rockett Declaration, ¶ 18.] 

In his second Declaration, Dr. Bedilion explains why a person of skill in the art in 1997 
would have understood that any expressed polynucleotide is useful for gene expression 
monitoring applications using cDNA microarrays. (Second Bedilion Declaration, e.g., ¶¶ 4-7.) 
In his Declaration, Dr. Iyer explains why a person of skill in the art in 1997 would have 
understood that any expressed polynucleotide is useful for gene expression monitoring 
applications using cDNA microarrays, stating that "[t]o provide maximum versatility as a 
research tool, the microarray should include -- and as a biologist I would want my microarray to 
include -- each newly identified gene as a probe." (Iyer Declaration, ¶ 9.) 

In addition, Dr. Rockett explains in his Declaration that "there are a number of other 
differential expression analysis technologies that precede the development of microarrays, some 
by decades, and that have been applied to drug metabolism and toxicology research, including: 
(1) differential screening; (2) subtractive hybridization, including variants such as chemical 
cross-linking subtraction, suppression-PCR subtractive hybridization and representational 
difference analysis; (3) differential display; (4) restriction endonuclease facilitated analyses, 
including serial analysis of gene expression (SAGE) and gene expression fingerprinting and (5) 
EST analysis." (Rockett Declaration, ¶ 7.) 

2 "Use of the words 'it is my opinion' to preface what someone of ordinary skill in the art 
would have known does not transform the factual statements contained in the declaration into 
opinion testimony." In re Alton, 37 USPQ2d 1578, 1583 (Fed. Cir. 1996). 

Nowhere does the Patent Examiner address the fact that, as described on, for example, 
page 34 of the Bandman '593 application, the claimed polynucleotides can be used as highly 
specific probes in, for example, cDNA microarrays, probes that without question can be used to 
measure both the existence and amount of complementary RNA sequences known to be the 
expression products of the claimed polynucleotides. The claimed invention is not, in that regard, 
some random sequence whose value as a probe is speculative or would require further research to 
determine. 

Given the fact that the claimed polynucleotide is known to be expressed, its utility as a 
measuring and analyzing instrument for expression levels is as indisputable as a scale's utility for 
measuring weight. This use as a measuring tool, regardless of how the expression level data 
ultimately would be used by a person of ordinary skill in the art, by itself demonstrates that the 
claimed invention provides an identifiable, real-world benefit that meets the utility requirement. 
Raytheon v. Roper, 724 F.2d 951 (Fed. Cir. 1983) (claimed invention need only meet one of its 
stated objectives to be useful); In re Cortright, 165 F.3d 1353, 1359 (Fed. Cir. 1999) (how the 
invention works is irrelevant to utility); MPEP § 2107 ("Many research tools such as gas 
chromatographs, screening assays, and nucleotide sequencing techniques have a clear, specific, 
and unquestionable utility (e.g., they are useful in analyzing compounds)" (emphasis added)). 

The First Bedilion Declaration shows that a number of pre-March 18, 1999 publications 
confirm and further establish the utility of cDNA microarrays in a wide range of drug 
development gene expression monitoring applications at the time the Bandman '593 application 
was filed (First Bedilion Declaration, ¶¶ 10-14; Bedilion Exhibits A-G). Indeed, Brown and 
Shalon U.S. Patent No. 5,807,522 (the Brown '522 patent, Bedilion Exhibit D), which issued 
from a patent application filed in June 1995 and was effectively published on December 29, 1995 
as a result of the publication of a PCT counterpart application, shows that the Patent Office 
recognizes the patentable utility of the cDNA technology developed in the early to mid-1990s. 

As explained by Dr. Bedilion, among other things (First Bedilion Declaration, ¶ 12): 

The Brown '522 patent further teaches that the "[m]icroarrays of immobilized 
nucleic acid sequences prepared in accordance with the invention" can be used in 
"numerous" genetic applications, including "monitoring of gene expression" 
applications (see Bedilion Tab D at col. 14, lines 36-42). The Brown '522 patent 
teaches (a) monitoring gene expression (i) in different tissue types, (ii) in different 
disease states, and (iii) in response to different drugs, and (b) that arrays disclosed 
therein may be used in toxicology studies (see Bedilion Tab D at col. 15, lines 13- 
18 and 52-58; and col. 18, lines 25-30). 

Literature reviews published shortly after the filing of the Bandman '593 application 
describing the state of the art further confirm the claimed invention's utility. Rockett et al. 
confirm, for example, that the claimed invention is useful for differential expression analysis 
regardless of how expression is regulated: 

Despite the development of multiple technological advances which have recently 
brought the field of gene expression profiling to the forefront of molecular 
analysis, recognition of the importance of differential gene expression and 
characterization of differentially expressed genes has existed for many years. 

* * * 

Although differential expression technologies are applicable to a broad range of 
models, perhaps their most important advantage is that, in most cases, absolutely 
no prior knowledge of the specific genes which are up- or down-regulated is 
required. 

* * * 

Whereas it would be informative to know the identity and functionality of all 
genes up/down regulated by . . . toxicants, this would appear a longer term goal 
.... However, the current use of gene profiling yields a pattern of gene changes 
for a xenobiotic of unknown toxicity which may be matched to that of well 
characterized toxins, thus alerting the toxicologist to possible in vivo similarities 
between the unknown and the standard, thereby providing a platform for more 
extensive toxicological examination. (emphasis in original) 

Rockett et al., Differential gene expression in drug metabolism and toxicology: practicalities, 
problems and potential, Xenobiotica 29:655-691 (July 1999) (Reference No. 2). 

In another pre-March 18, 1999 article, Lashkari et al. state explicitly that sequences that 
are merely "predicted" to be expressed (predicted Open Reading Frames, or ORFs) - the claimed 
invention in fact is known to be expressed - have numerous uses: 

Efforts have been directed toward the amplification of each predicted ORF or any 
other region of the genome ranging from a few base pairs to several kilobase 
pairs. There are many uses for these amplicons - they can be cloned into standard 
vectors or specialized expression vectors, or can be cloned into other specialized 
vectors such as those used for two-hybrid analysis. The amplicons can also be 
used directly by, for example, arraying onto glass for expression analysis, for 
DNA binding assays, or for any direct DNA assay. (emphasis added) 

Lashkari et al., Whole genome analysis: Experimental access to all genome sequenced segments 
through larger-scale efficient oligonucleotide synthesis and PCR, Proc. Nat. Acad. Sci. 
94:8945-8947 (Aug. 1997) (Reference No. 3). 

B. The use of polynucleotides coding for polypeptides expressed by humans as 
tools for toxicology testing, drug discovery, and the diagnosis of disease is 
now "well-established" 

The technologies made possible by expression profiling and the DNA tools upon which 
they rely are now well-established. The technical literature recognizes not only the prevalence of 
these technologies, but also their unprecedented advantages in drug development, testing and 
safety assessment. These technologies include toxicology testing, e.g., as described by Bedilion, 
Rockett, and Iyer in their Declarations. 

Toxicology testing is now standard practice in the pharmaceutical industry. See, e.g., 
John C. Rockett et al., supra: 

Knowledge of toxin-dependent regulation in target tissues is not solely an academic 
pursuit as much interest has been generated in the pharmaceutical industry to harness this 
technology in the early identification of toxic drug candidates, thereby shortening the 
developmental process and contributing substantially to the safety assessment of new 
drugs. (Reference No. 2, page 656) 

To the same effect are several other scientific publications, including Emile F. Nuwaysir et al., 
Microarrays and toxicology: The advent of toxicogenomics, Molecular Carcinogenesis 24:153-159 
(1999) (Reference No. 4); and Sandra Steiner and N. Leigh Anderson, Expression profiling in 
toxicology - potentials and limitations, Toxicology Letters 112-13:467-471 (2000) (Reference 
No. 5). 

Nucleic acids useful for measuring the expression of whole classes of genes are routinely 
incorporated for use in toxicology testing. Nuwaysir et al. describe, for example, a Human 
ToxChip comprising 2089 human clones, which were selected 

for their well-documented involvement in basic cellular processes as well as their 
responses to different types of toxic insult. Included on this list are DNA replication and 
repair genes, apoptosis genes, and genes responsive to PAHs and dioxin-like compounds, 
peroxisome proliferators, estrogenic compounds, and oxidant stress. Some of the other 
categories of genes include transcription factors, oncogenes, tumor suppressor genes, 
cyclins, kinases, phosphatases, cell adhesion and motility genes, and homeobox genes. 
Also included in this group are 84 housekeeping genes, whose hybridization intensity is 
averaged and used for signal normalization of the other genes on the chip. 

See also Table 1 of Nuwaysir et al. (listing additional classes of genes deemed to be of special 
interest in making a human toxicology microarray). 
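
By way of illustration only, the signal normalization described in the Nuwaysir passage quoted above (averaging the hybridization intensity of housekeeping genes and using that average to normalize the other genes on the chip) might be sketched as follows. The gene names, intensity values, and function name are hypothetical and are not taken from the reference; the sketch assumes simple division by the mean control intensity, which the reference does not specify.

```python
from statistics import mean


def normalize_by_housekeeping(intensities, housekeeping):
    """Divide every intensity by the mean intensity of the housekeeping genes.

    `intensities` maps gene name -> raw hybridization intensity.
    `housekeeping` lists the control gene names (hypothetical here).
    """
    controls = [intensities[g] for g in housekeeping if g in intensities]
    if not controls:
        raise ValueError("no housekeeping genes found on the array")
    baseline = mean(controls)
    if baseline == 0:
        raise ValueError("housekeeping baseline is zero")
    return {gene: value / baseline for gene, value in intensities.items()}


# Hypothetical raw intensities for two control genes and two test genes.
raw = {"HK1": 100.0, "HK2": 300.0, "GENE_A": 400.0, "GENE_B": 50.0}
normalized = normalize_by_housekeeping(raw, ["HK1", "HK2"])
```

With these assumed values the control average is 200, so a test gene reading 400 is reported as twice the control level and one reading 50 as one quarter of it.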

The more genes that are available for use in toxicology testing, the more powerful the 
technique. "Arrays are at their most powerful when they contain the entire genome of the species 
they are being used to study." John C. Rockett and David J. Dix, Application of DNA arrays to 
toxicology, Environ. Health Perspec. 107:681-685 (1999) (Reference No. 6). Control genes are 
carefully selected for their stability across a large set of array experiments in order to best study 
the effect of toxicological compounds. See attached email from the primary investigator on the 
Nuwaysir paper, Dr. Cynthia Afshari, to an Incyte employee, dated July 3, 2000, as well as the 
original message to which she was responding (Reference No. 7), indicating that even the 
expression of carefully selected control genes can be altered. Thus, there is no expressed gene 
which is irrelevant to screening for toxicological effects, and all expressed genes have a utility 
for toxicological screening. 

Further evidence of the well-established utility of all expressed polypeptides and 
polynucleotides in toxicology testing is found in U.S. Pat. No. 5,569,588 (Reference No. c) 
and published PCT applications WO 95/21944 (Reference No. a), WO 95/20681 (Reference 
No. b), and WO 97/13877 (Reference No. d). 

WO 95/21944 ("Differentially expressed genes in healthy and diseased subjects"), 
published August 17, 1995, describes the use of microarrays in expression profiling analyses, 
emphasizing that patterns of expression can be used to distinguish healthy tissues from diseased 
tissues and that patterns of expression can additionally be used in drug development and 
toxicology studies, without knowledge of the biological function of the encoded gene product. 

In particular, and with emphasis added: 

The present invention involves . . . methods for diagnosing diseases . . . 
characterized by the presence of [differentially expressed] . . . genes, despite the 
absence of knowledge about the gene or its function. The methods involve the use 
of a composition suitable for use in hybridization which consists of a solid surface 
on which is immobilized at pre-defined regions thereon a plurality of defined 
oligonucleotide/polynucleotide sequences for hybridization. Each sequence 
comprises a fragment of an EST . . . . Differences in hybridization patterns produced 
through use of this composition and the specified methods enable diagnosis of 
diseases based on differential expression of genes of unknown function . . . . 
[abstract] 

The method [of the present invention] involves producing and comparing 
hybridization patterns formed between samples of expressed mRNA or cDNA 
polynucleotide sequences . . . and a defined set of oligonucleotide/polynucleotide[s] 
. . . immobilized on a support. Those defined [immobilized] 
oligonucleotide/polynucleotide sequences are representative of the total expressed 
genetic component of the cells, tissues, organs or organism as defined by the 
collection of partial cDNA sequences (ESTs). [page 2] 

The present invention meets the unfilled needs in the art by providing 
methods for the . . . use of gene fragments and genes, even those of unknown full 
length sequence and unknown function, which are differentially expressed in a 
healthy animal and in an animal having a specific disease or infection by use of 
ESTs derived from DNA libraries of healthy and/or diseased/infected animals. 
[page 4] 

Yet another aspect of the invention is that it provides . . . a means for . . . 
monitoring the efficacy of disease treatment regimes including . . . toxicological 
effects thereof. [page 4] 

It has been appreciated that one or more differentially identified EST or 
gene-specific oligonucleotide/polynucleotides define a pattern of differentially 
expressed genes diagnostic of a predisease, disease or infective state. A knowledge 
of the specific biological function of the EST is not required only that the EST[] 
identifies a gene or genes whose altered expression is associated reproducibly with 
the predisease, disease or infectious state. [page 4] 

As used herein, the term 'disease' or 'disease state' refers to any condition 
which deviates from a normal or standardized healthy state in an organism of the 
same species in terms of differential expression of the organism's genes. . . 
[whether] of genetic or environmental origin, for example, an inherited disorder 
such as certain breast cancers. . . . [or] administration of a drug or exposure of the 
animal to another agent, e.g., nutrition, which affects gene expression. [page 5] 

As used herein, the term 'solid support' refers to any known substrate which 
is useful for the immobilization of large numbers of oligonucleotide/polynucleotide 
sequences by any available method . . . [and includes, inter alia,] nitrocellulose, . . . 
glass, silica. . . . [page 6] 

By 'EST' or 'Expressed Sequence Tag' is meant a partial DNA or cDNA 
sequence of about 150 to 500, more preferably about 300, sequential nucleotides. . . . 
[page 6] 

One or more libraries made from a single tissue type typically provide at 
least about 3000 different (i.e., unique) ESTs and potentially the full complement of 
all possible ESTs representing all cDNAs e.g., 50,000-100,000 in an animal such 
as a human. [page 7] 

The lengths of the defined oligonucleotide/polynucleotides may be readily 
increased or decreased as desired or needed. . . . The length is generally guided by 
the principle that it should be of sufficient length to insure that it is on[] average 
only represented once in the population to be examined. [page 7] 

Comparing the . . . hybridization patterns permits detection of those defined 
oligonucleotide/polynucleotides which are differentially expressed between the 
healthy control and the disease sample by the presence of differences in the 
hybridization patterns at pre-defined regions [of the solid support]. [page 13] 

It should be appreciated that one does not have to be restricted in using 
ESTs from a particular tissue from which probe RNA or cDNA is obtained[;] rather 
any or all ESTs (known or unknown) may be placed on the support. Hybridization 
will be used [to] form diagnostic patterns or to identify which particular EST is 
detected. For example, all known ESTs from an organism are used to produce a 
'master' solid support to which control sample and disease samples are alternately 
hybridized. [page 14] 

Diagnosis is accomplished by comparing the two hybridization patterns, 
wherein substantial differences between the first and second hybridization patterns 
indicate the presence of the selected disease or infection in the animal being tested. 
Substantially similar first and second hybridization patterns indicate the absence of 
disease or infection. This[,] like many of the foregoing embodiments[,] may use 
known or unknown ESTs derived from many libraries. [page 18] 

Still another intriguing use of this method is in the area of monitoring the 
effects of drugs on gene expression, both in laboratories and during clinical trials 
with animal[s], especially humans. [page 18] 

WO 95/20681 ("Comparative Gene Transcript Analysis"), filed in 1994 by 
Applicants' assignee and published August 3, 1995, has three issued U.S. counterparts: 
U.S. Pat. Nos. 5,840,484, issued November 24, 1998; 6,114,114, issued September 5, 2000; and 
6,303,297, issued October 16, 2001. 

The specification describes the use of transcript expression patterns, or "images", 
each comprising multiple pixels of gene-specific information, for diagnosis, for cellular 
phenotyping, and in toxicology and drug development efforts. The specification describes a 
plurality of methods for obtaining the requisite expression data - one of which is microarray 
hybridization - and equates the uses of the expression data from these disparate platforms. In 
particular, and with emphasis added: 

The invention provides a "method and system for quantifying the relative 
abundance of gene transcripts in a biological specimen. . . . [G]ene transcript 
imaging can be used to detect or diagnose a particular biological state, disease, or 
condition which is correlated to the relative abundance of gene transcripts in a 
given cell or population of cells. The invention provides a method for comparing 
the gene transcript image analysis from two or more different biological specimens 
in order to distinguish between the two specimens and identify one or more genes 
which are differentially expressed between the two specimens." [abstract] 

"[W]e see each individual gene product as a 'pixel' of information which 
relates to the expression of that, and only that, gene. We teach herein [] methods 
whereby the individual 'pixels' of gene expression information can be combined 
into a single gene transcript 'image,' in which each of the individual genes can be 
visualized simultaneously and allowing relationships between the gene pixels to be 
easily visualized and understood." [page 2] 

"The present invention avoids the drawbacks of the prior art by providing a 
method to quantify the relative abundance of multiple gene transcripts in a given 
biological specimen . . . . The method of the instant invention provides for detailed 
diagnostic comparisons of cell profiles revealing numerous changes in the 
expression of individual transcripts." [page 6] 

"High resolution analysis of gene expression [can] be used directly as a diagnostic 
profile . . . ." [page 7] 

"The method is particularly powerful when more than 100 and preferably 
more than 1,000 gene transcripts are analyzed." [page 7] 

"The invention . . . includes a method of comparing specimens containing 
gene transcripts." [page 7] 

"The final data values from the first specimen and the further identified 
sequence values from the second specimen are processed to generate ratios of 
transcript sequences, which indicate the differences in the number of gene 
transcripts between the two specimens." [i.e., the results yield analogous data to 
microarrays] [page 8] 

"Also disclosed is a method of producing a gene transcript image analysis 
by first obtaining a mixture of mRNA, from which cDNA copies are made." [page 
8] 

"In a further embodiment, the relative abundance of the gene transcripts in 
one cell type or tissue is compared with the relative abundance of gene transcript 
numbers in a second cell type or tissue in order to identify the differences and 
similarities." [page 9] 

"In essence, the invention is a method and system for quantifying the 
relative abundance of gene transcripts in a biological specimen. The invention 
provides a method for comparing the gene transcript image from two or more 
different biological specimens in order to distinguish between the two specimens. . . ." 
[page 9] 

"[T]wo or more gene transcript images can be compared and used to detect 
or diagnose a particular biological state, disease, or condition which is correlated to 
the relative abundance of gene transcripts in a given cell or population of cells." 
[pages 9-10] 

"The present invention provides a method to compare the relative 
abundance of gene transcripts in different biological specimens. . . . This process is 
denoted herein as gene transcript imaging. The quantitative analysis of the relative 
abundance for a set of gene transcripts is denoted herein as 'gene transcript image 
analysis' or 'gene transcript frequency analysis'. The present invention allows one 
to obtain a profile for gene transcription in any given population of cells or tissue 
from any type of organism." [page 11] 

"The invention has significant advantages in the fields of diagnostics, 
toxicology and pharmacology, to name a few." [page 12] 

"[G]ene transcript sequence abundances are compared against reference 
database sequence abundances including normal data sets for diseased and healthy 
patients. The patient has the disease(s) with which the patient's dataset most 
closely correlates." [page 12] 

"For example, gene transcript frequency analysis can be used to differentiate 
normal cells or tissues from diseased cells or tissues. . . ." [page 12] 

"In toxicology, . . . [g]ene transcript imaging provides highly detailed 
information on the cell and tissue environment, some of which would not be 
obvious in conventional, less detailed screening methods. The gene transcript 
image is a more powerful method to predict drug toxicity and efficacy. Similar 
benefits accrue in the use of this tool in pharmacology. . . ." [page 12] 

"In an alternative embodiment, comparative gene transcript frequency 
analysis is used to differentiate between cancer cells which respond to anti-cancer 
agents and those which do not respond." [page 12] 

"In a further embodiment, comparative gene transcript frequency analysis is 
used ... for the selection of better pharmacologic animal models." [page 14] 

"In a further embodiment, comparative gene transcript frequency analysis is 
used in a clinical setting to give a highly detailed gene transcript profile of a 
diseased state or condition." [page 14] 

"An alternate method of producing a gene transcript image includes the 
steps of obtaining a mixture of test mRNA and providing a representative array of 
unique probes whose sequences are complementary to at least some of the test 
mRNAs. Next, a fixed amount of the test mRNA is added to the arrayed probes. 
The test mRNA is incubated with the probes for a sufficient time to allow hybrids 
of the test mRNA and probes to form. The mRNA-probe hybrids are detected and 
the quantity determined." [page 15] 

"[T]his research tool provides a way to get new drugs to the public faster 
and more economically." [page 36] 

"In this method, the particular physiologic function of the protein transcript 
need not be determined to qualify the gene transcript as a clinical marker." [page 
38] 

"[T]he gene transcript changes noted in the earlier rat toxicity study are 
carefully evaluated as clinical markers in the followed patients. Changes in the 
gene transcript image analyses are evaluated as indicators of toxicity by correlation 
with clinical signs and symptoms and other laboratory results. . . . The . . . analysis 
highlights any toxicological changes in the treated patients." [page 39] 

U.S. Pat. No. 5,569,588 ("Methods for Drug Screening") ("the '588 patent"), 
issued October 29, 1996, with a priority date of August 1995, describes an expression profiling 
platform, the "genome reporter matrix", which is different from nucleic acid microarrays. 
Additionally describing use of nucleic acid microarrays, the patent makes clear that the utility of 
comparing multidimensional expression datasets is independent of the methods by which such 
profiles are obtained. The patent speaks clearly to the usefulness of such expression analyses in 
drug development and toxicology, particularly pointing out that a gene's failure to change in 
expression level is a useful result. Thus, with emphasis added, 

The invention provides "[m]ethods and compositions for modeling the 
transcriptional responsiveness of an organism to a candidate drug. . . . [The final 
step of the method comprises] comparing reporter gene product signals for each cell 
before and after contacting the cell with the candidate drug to obtain a drug 
response profile which provides a model of the transcriptional responsiveness of 
said organism to the candidate drug." [abstract] 

"The present invention exploits the recent advances in genome science to 
provide for the rapid screening of large numbers of compounds against a systemic 
target comprising substantially all targets in a pathway [or] organism." [col. 1] 

"The ensemble of reporting cells comprises as comprehensive a collection 
of transcription regulatory genetic elements as is conveniently available for the 
targeted organism so as to most accurately model the systemic transcriptional 
response. Suitable ensembles generally comprise thousands of individually 
reporting elements; preferred ensembles are substantially comprehensive, i.e. 
provide a transcriptional response diversity comparable to that of the target 
organism. Generally, a substantially comprehensive ensemble requires transcription 
regulatory genetic elements from at least a majority of the organism's genes, and 
preferably includes those of all or nearly all of the genes. We term such a 
substantially comprehensive ensemble a genome reporter matrix." [col. 2] 

"Drugs often have side effects that are in part due to the lack of target 
specificity. . . . [A] genome reporter matrix reveals the spectrum of other genes in 
the genome also affected by the compound. In considering two different 
compounds both of which induce the ERG10 reporter, if one compound affects the 
expression of 5 other reporters and a second compound affects the expression of 50 
other reporters, the first compound is, a priori, more likely to have fewer side 
effects." [cols. 2-3] 

"Furthermore, it is not necessary to know the identity of any of the 
responding genes." [col. 3] 

"[A]ny new compound that induces the same response profile as [a] . . . 
dominant tubulin mutant would provide a candidate for a taxol-like 
pharmaceutical." [col. 4] 

"The genome reporter matrix offers a simple solution to recognizing new 
specificities in combinatorial libraries. Specifically, pools of new compounds are 
tested as mixtures across the matrix. If the pool has any new activity not present in 
the original lead compound, new genes are affected among the reporters." [col. 4] 

"A sufficient number of different recombinant cells are included to provide 
an ensemble of transcriptional regulatory elements of said organism sufficient to 
model the transcriptional responsiveness of said organism to a drug. In a preferred 
embodiment, the matrix is substantially comprehensive for the selected regulatory 
elements, e.g. essentially all of the gene promoters of the targeted organism are 
included." [cols. 6-7] 

"In a preferred embodiment, the basal response profiles are determined. . . . 
The resultant electrical output signals are stored in a computer memory as genome 
reporter output signal matrix data structure associating each output signal with the 
coordinates of the corresponding microtiter plate well and the stimulus or drug. 
This information is indexed against the matrix to form reference response profiles 
that are used to determine the response of each reporter to any milieu in which a 
stimulus may be provided. After establishing a basal response profile for the 
matrix, each cell is contacted with a candidate drug. The term drug is used loosely 
to refer to agents which can provoke a specific cellular response. . . . The drug 
induces a complex response pattern of repression, silence and induction across the 
matrix. . . . The response profile reflects the cell's transcriptional adjustments to 
maintain homeostasis in the presence of the drug. . . . After contacting the cells with 
the candidate drug, the reporter gene product signals from each of said cells is again 
measured to determine a stimulated response profile. The basal o[r] background 
response profile is then compared with ... the stimulated response profile to 
identify the cellular response profile to the candidate drug." [cols. 7-8] 

"In another embodiment of the invention, a matrix [i.e., array] of 
hybridization probes corresponding to a predetermined population of genes of the 
selected organism is used to specifically detect changes in gene transcription which 
result from exposing the selected organism or cells thereof to a candidate drug. In 
this embodiment, one or more cells derived from the organism is exposed to the 
candidate drug in vivo or ex vivo under conditions wherein the drug effects a 
change in gene transcription in the cell to maintain homeostasis. Thereafter, the 
gene transcripts, primarily mRNA, of the cell or cells is isolated . . . [and] then 
contacted with an ordered matrix [array] of hybridization probes, each probe being 
specific for a different one of the transcripts, under conditions where each of the 
transcripts hybridizes with a corresponding one of the probes to form hybridization 
pairs. The ordered matrix of probes provides, in aggregate, complements for an 
ensemble of genes of the organism sufficient to model the transcriptional 
responsiveness of the organism to a drug. . . . The matrix-wide signal profile of the 
drug-stimulated cells is then compared with a matrix-wide signal profile of negative 
control cells to obtain a specific drug response profile." [col. 8] 

"The invention also provides means for computer-based qualitative analysis 
of candidate drugs and unknown compounds. A wide variety of reference response 
profiles may be generated and used in such analyses." [col. 8] 

"Response profiles for an unknown stimulus (e.g. new chemicals, unknown 
compounds or unknown mixtures) may be analyzed by comparing the new stimulus 
response profiles with response profiles to known chemical stimuli." [col. 9] 

"The response profile of a new chemical stimulus may also be compared to 
a known genetic response profile for target gene(s)." [col. 9] 
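
By way of illustration only, the comparison of an unknown stimulus's response profile against reference response profiles, as described in the '588 patent passages quoted above, might be sketched as follows. The '588 patent does not prescribe a particular similarity measure; the use of Pearson correlation here is an assumption made for the sketch, and the compound names and profile values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length response profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (spread_x * spread_y)


def closest_reference(unknown, references):
    """Return the name of the reference profile most correlated with `unknown`."""
    return max(references, key=lambda name: pearson(unknown, references[name]))


# Hypothetical per-reporter changes (induction > 0, repression < 0, silence = 0).
references = {
    "compound_A": [2.0, 0.0, -1.0, 0.5],
    "compound_B": [-1.0, 1.5, 2.0, 0.0],
}
unknown = [1.8, 0.1, -0.9, 0.4]
best_match = closest_reference(unknown, references)
```

Under these assumed values the unknown profile tracks compound_A closely and runs opposite to compound_B, so compound_A is returned as the closest reference.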

The August 11, 1997 press release from the '588 patent's assignee, Acacia 
Biosciences (now part of Merck) (reference "h" attached hereto), and the September 15, 1997 
news report by Glaser, "Strategies for Target Validation Streamline Evaluation of Leads," 
Genetic Engineering News (reference "i" attached hereto), attest to the commercial value of the 
methods and technology described and claimed in the '588 patent. 

WO 97/13877 ("Measurement of Gene Expression Profiles in Toxicity 
Determinations"), published April 17, 1997, describes an expression profiling technology 
differing somewhat from the use of cDNA microarrays and differing from the genome reporter 
matrix of the '588 patent, but the use of the data is analogous. As per its title, the reference 
describes use of expression profiling in toxicity determinations. In particular, and with emphasis 
added: 

"[T]he invention relates to a method for detecting and monitoring changes 
in gene expression patterns in in vitro and in vivo systems for determining the 
toxicity of drug candidates." [Field of the invention] 

"An object of the invention is to provide a new approach to toxicity 
assessment based on an examination of gene expression patterns, or profiles, in in 
vitro or in vivo test systems." [page 3] 

"Another object of the invention is to provide a rapid and reliable method 
for correlating gene expression with short term and long term toxicity in test 
animals." [page 3] 

"The invention achieves these and other objects by providing a method for 
massively parallel signature sequencing of genes expressed in one or more selected 
tissues of an organism exposed to a test compound. An important feature of the 
invention is the application of novel . . . methodologies that permit the formation of 
gene expression profiles for selected tissues. . . . Such profiles may be compared 
with those from tissues of control organisms at single or multiple time points to 
identify expression patterns predictive of toxicity." [page 3] 

"As used herein, the terms 'gene expression profile,' and 'gene expression 
pattern' which is used equivalently, means a frequency distribution of sequences of 
portions of cDNA molecules sampled from a population of tag-cDNA conjugates. . . . 
Preferably, the total number of sequences determined is at least 1000; more 
preferably, the total number of sequences determined in a gene expression profile is 
at least ten thousand." [page 7] 

"The invention provides a method for determining the toxicity of a 
compound by analyzing changes in the gene expression profiles in selected tissues 
of test organisms exposed to the compound. . . . Gene expression profiles derived 
from test organisms are compared to gene expression profiles derived from control 
organisms. . . ." [page 7] 

Therefore, the potential benefit to the public, in terms of lives saved and reduced health 
care costs, is enormous. Evidence of the benefits of this information includes: 

• In 1999, CV Therapeutics, an Incyte collaborator, was able to use Incyte gene 
expression technology, information about the structure of a known transporter 
gene, and chromosomal mapping location, to identify the key gene associated 
with Tangier disease. This discovery took place over a matter of only a few 
weeks, due to the power of these new genomics technologies. The discovery 
received an award from the American Heart Association as one of the top 10 
discoveries associated with heart disease research in 1999. 

• In an April 9, 2000, article published by the Bloomberg news service, an Incyte 
customer stated that it had reduced the time associated with target discovery and 
validation from 36 months to 18 months, through use of Incyte's genomic 
information database. Other Incyte customers have privately reported similar 
experiences. The implications of this significant saving of time and expense for 
the number of drugs that may be developed and their cost are obvious. 

• In a February 10, 2000, article in the Wall Street Journal, one Incyte customer 
stated that over 50 percent of the drug targets in its current pipeline were derived 
from the Incyte database. Other Incyte customers have privately reported similar 
experiences. By doubling the number of targets available to pharmaceutical 
researchers, Incyte genomic information has demonstrably accelerated the 
development of new drugs. 

Because the Patent Examiner failed to address or consider the "well-established" utilities 
for the claimed invention in toxicology testing, drug development, and the diagnosis of disease, 
the Examiner's rejections should be overturned regardless of their merit. 

C. The similarity of the polypeptide encoded by the claimed invention to 
another polypeptide of undisputed utility demonstrates utility 

In addition to having substantial, specific and credible utilities in numerous gene 
expression monitoring applications, the utility of the claimed polynucleotide can be imputed 
based on the relationship between the polypeptide it encodes, HRIP, and another polypeptide of 
unquestioned utility, sphingosine kinase. The two polypeptides have sufficient similarities in 
their sequences that a person of ordinary skill in the art would recognize more than a reasonable 
probability that the polypeptide encoded for by the claimed invention has utility similar to 
sphingosine kinase. Applicants need not show any more to demonstrate utility. In re Brana, 51 
F.3d at 1567. 

It is undisputed, and readily apparent from the patent application, that the polypeptide 
encoded for by the claimed polynucleotide shares more than 80% sequence identity over 384 
amino acid residues with mouse sphingosine kinase (g3659694). Furthermore, an alignment of 
SEQ ID NO:5 with a post-filing human sphingosine kinase shows that the two sequences are 
approximately 99% identical over the entire 384 amino acid residue length of both sequences. In 
addition, a diacylglycerol kinase catalytic domain was identified by searching for statistically 
significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein 
families/domains. Sequence analysis of several families of kinases suggests that diacylglycerol 
kinases and sphingosine kinases are members of the same superfamily of genes due to a common 
domain (Labesse et al., Trends Biochem. Sci. 27:273-5, 2002; Reference No. 8). In an earlier 
report, Kohama et al. state that "the C1 and C3 subdomains of sphingosine kinase show high 
amino acid similarity to residues 296-315 and 378-389 of human diacylglycerol kinase (with 
35% and 58% identity, respectively)" (JBC 273:23722-8, 1998). Taken together, these data 
suggest that there is more than enough homology to demonstrate a reasonable probability that the 
utility of sphingosine kinase can be imputed to the claimed invention (through the polypeptide it 
encodes). It is well-known that the probability that two unrelated polypeptides share more than 
40% sequence homology over 70 amino acid residues is exceedingly small. (Brenner et al., Proc. 
Natl. Acad. Sci. 95:6073-78 (1998); Reference No. 9.) Given homology in excess of 40% over 
many more than 70 amino acid residues, the probability that the polypeptide encoded for by the 
claimed polynucleotide is related to mouse sphingosine kinase is, accordingly, very high. 
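
For illustration only, the percent-identity figures relied on above can be understood as a simple ratio: identical residues divided by the aligned positions compared. The following is a minimal sketch under stated assumptions, not the alignment method used in the specification or in the cited references. The function name `percent_identity` and the two toy sequences are hypothetical and are not SEQ ID NO:5 or g3659694; the sketch also assumes the two sequences have already been aligned, and it excludes gapped columns from the comparison.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0  # columns where both sequences have a residue
    matches = 0   # columns where the residues are identical
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical, made-up aligned fragments used only to exercise the function.
toy_a = "MDPAGGPRGV-LPRPCRVLV"
toy_b = "MEPAGGPRGVALPRPCRALV"
identity = percent_identity(toy_a, toy_b)
```

Other conventions exist (for example, counting gapped columns in the denominator), so a reported percentage should always be read together with the length of the region compared, as the argument above does.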

The Examiner must accept the Applicants' demonstration that the homology between the 
polypeptide encoded for by the claimed invention and mouse sphingosine kinase demonstrates 
utility by a reasonable probability unless the Examiner can demonstrate through evidence or 
sound scientific reasoning that a person of ordinary skill in the art would doubt utility. See In re 
Langer, 503 F.2d 1380, 1391-92, 183 USPQ 288 (CCPA 1974). The Examiner has not provided 
sufficient evidence or sound scientific reasoning to the contrary. 

D. Objective evidence corroborates the utilities of the claimed invention 

There is, in fact, no restriction on the kinds of evidence a Patent Examiner may consider 
in determining whether a "real-world" utility exists. "Real-world" evidence, such as evidence 
showing actual use or commercial success of the invention, can demonstrate conclusive proof of 
utility. Raytheon v. Roper, 220 USPQ 592 (Fed. Cir. 1983); Nestle v. Eugene, 55 F.2d 854, 
856, 12 USPQ 335 (6th Cir. 1932). Indeed, proof that the invention is made, used or sold by any 
person or entity other than the patentee is conclusive proof of utility. United States Steel Corp. 
v. Phillips Petroleum Co., 865 F.2d 1247, 1252, 9 USPQ2d 1461 (Fed. Cir. 1989). 

Over the past several years, a thriving market has developed for databases containing the 
sequences of all expressed genes (along with the polypeptide translations of those genes), in 
particular genes having medical and pharmaceutical significance such as the instant sequence. 
(Note that the value in these databases is enhanced by their completeness, but each sequence in 
them is independently valuable.) The databases sold by Applicants' assignee, Incyte, include 
exactly the kinds of information made possible by the claimed invention, such as tissue and 
disease associations. Incyte sells its database containing the claimed sequence and millions of 
other sequences throughout the scientific community, including to pharmaceutical companies 
who use the information to develop new pharmaceuticals. 

Both Incyte's customers and the scientific community have acknowledged that Incyte's 
databases have proven to be valuable in, for example, the identification and development of drug 
candidates. Page et al., in discussing the identification and assignment of candidate drug targets, 
state that "rapid identification and assignment of candidate targets and markers represents a huge 
challenge ... [t]he process of annotation is similarly aided by the quantity and richness of the 
sequence specific databases that are currently available, both in the public domain and in the 
private sector (e.g. those supplied by Incyte Pharmaceuticals)" (Page, M.J. et al., "Proteomics: a 
major new technology for the drug discovery process," Drug Discov. Today 4:55-62 (1999) 
(Reference No. 10), see page 58, col. 2). As Incyte adds information to its databases, including 
the information that can be generated only as a result of Incyte's invention of the claimed 
polynucleotide and its use of that polynucleotide on cDNA microarrays, the databases become 
even more powerful tools. Thus the claimed invention adds more than incremental benefit to the 
drug discovery and development process. 

Customers can, moreover, purchase the claimed polynucleotide directly from Incyte, 
saving the customer the time and expense of isolating and purifying or cloning the 
polynucleotide for research uses such as those described supra. 

III. The Patent Examiner's rejections are without merit 

Rather than responding to the evidence demonstrating utility, the Examiner attempts to 
dismiss it altogether by arguing that, given a "disclosure that a protein is a kinase without a more 
specific recitation of what type of kinase (i.e., what compound(s) is phosphorylated)," the disclosed and 
well-established utilities for the claimed polynucleotide are not specific or substantial utilities 
(Office Action at page 7). The Examiner is incorrect both as a matter of law and as a matter of 
fact. 

A. The precise biological role or function of an expressed polynucleotide is not 
required to demonstrate utility 

The Patent Examiner's primary rejection of the claimed invention is based on the ground 
that, without information as to the precise "biological role" of the claimed invention, the claimed 
invention's utility is not sufficiently specific. According to the Examiner, it is not enough that a 
person of ordinary skill in the art could use and, in fact, would want to use the claimed invention 
either by itself or in a cDNA microarray to monitor the expression of genes for such applications 
as the evaluation of a drug's efficacy and toxicity. The Examiner would require, in addition, that 
the applicant provide a specific and substantial interpretation of the results generated in any 
given expression analysis. 

It may be that specific and substantial interpretations and detailed information on 
biological function are necessary to satisfy the requirements for publication in some technical 
journals, but they are not necessary to satisfy the requirements for obtaining a United States 
patent. The relevant question is not, as the Examiner would have it, whether it is known how or 
why the invention works, In re Cortright, 165 F.3d 1353, 1359 (Fed. Cir. 1999), but rather 
whether the invention provides an "identifiable benefit" in presently available form. Juicy Whip 
Inc. v. Orange Bang Inc., 185 F.3d 1364, 1366 (Fed. Cir. 1999). If the benefit exists, and there is 
a substantial likelihood the invention provides the benefit, it is useful. There can be no doubt, 
particularly in view of the Bedilion Declaration (at, e.g., 10 and 15), that the present invention 
meets this test. 

The threshold for determining whether an invention produces an identifiable benefit is 
low. Juicy Whip, 185 F.3d at 1366. Only those utilities that are so nebulous that a person of 
ordinary skill in the art would not know how to achieve an identifiable benefit and, at least 
according to the PTO guidelines, so-called "throwaway" utilities that are not directed to a person 
of ordinary skill in the art at all, do not meet the statutory requirement of utility. Utility 
Examination Guidelines, 66 Fed. Reg. 1092 (Jan. 5, 2001). 

Knowledge of the biological function or role of a biological molecule has never been 
required to show real-world benefit. In its most recent explanation of its own utility guidelines, 
the PTO acknowledged as much (66 F.R. at 1095): 

[T]he utility of a claimed DNA does not necessarily depend on the function of the 
encoded gene product. A claimed DNA may have specific and substantial utility 
because, e.g., it hybridizes near a disease-associated gene or it has gene-regulating 
activity. 

By implicitly requiring knowledge of biological function for any claimed nucleic acid, 
the Examiner has, contrary to law, elevated what is at most an evidentiary factor into an absolute 
requirement of utility. Rather than looking to the biological role or function of the claimed 
invention, the Examiner should have looked first to the benefits it is alleged to provide. 

B. Membership in a class of useful products can be proof of utility 

Despite the evidence that the claimed polynucleotide encodes a polypeptide in the kinase 
family, the Examiner refused to impute the utility of the members of the kinase family to HRIP. 
In the Office Action, the Patent Examiner takes the position that, unless Applicants can identify 
which particular biological function within the class of kinases is possessed by HRIP (i.e., the 
substrate of HRIP), utility cannot be imputed. To demonstrate utility by membership in the class 
of kinases, the Examiner would require that all kinases possess a "common" utility. The 
Examiner is incorrect both as a matter of fact and of law. 

There is no such requirement in the law. In order to demonstrate utility by membership 
in a class, the law requires only that the class not contain a substantial number of useless 
members. So long as the class does not contain a substantial number of useless members, there 
is sufficient likelihood that the claimed invention will have utility, and a rejection under 
35 U.S.C. § 101 is improper. That is true regardless of how the claimed invention ultimately is 
used and whether or not the members of the class possess one utility or many. See Brenner v. 
Manson, 383 U.S. 519, 532 (1966); Application of Kirk, 376 F.2d 936, 943 (CCPA 1967). 

Membership in a "general" class is insufficient to demonstrate utility only if the class 
contains a sufficient number of useless members such that a person of ordinary skill in the art 
could not impute utility by a substantial likelihood. There would be, in that case, a substantial 
likelihood that the claimed invention is one of the useless members of the class. In the few cases 
in which class membership did not prove utility by substantial likelihood, the classes did in fact 
include predominately useless members. E.g., Brenner (man-made steroids); Kirk (same); Natta 
(man-made polyethylene polymers). 

The Examiner addresses HRIP as if the general class in which it is included is not the 
kinase family, but rather all polynucleotides or all polypeptides, including the vast majority of 
useless theoretical molecules not occurring in nature, and thus not pre-selected by nature to be 
useful. While these "general classes" may contain a substantial number of useless members, the 
kinase family does not. The kinase family is sufficiently specific to rule out any reasonable 
possibility that HRIP would not also be useful like the other members of the family. 

Because the Examiner has not presented any evidence that the kinase family has any, let 
alone a substantial number, of useless members, the Examiner must conclude that there is a 
"substantial likelihood" that the HRIP encoded by the claimed polynucleotide is useful. It 
follows that the claimed polynucleotide also is useful. 

Even if the Examiner's "common utility" criterion were correct - and it is not - the kinase 
family would meet it. It is undisputed that known members of the kinase family phosphorylate 
proteins. A person of ordinary skill in the art need not know any more about how or what the 
claimed invention phosphorylates to use it, and the Examiner presents no evidence to the 
contrary. Instead, the Examiner makes the conclusory observation that a person of ordinary skill 
in the art would need to know the substrate of any given kinase. 

Not so. As demonstrated by Applicants, knowledge that HRIP is a kinase is more than 
sufficient to make it useful for the diagnosis and treatment of neurological, cell proliferative, and 
autoimmune/inflammatory disorders. Indeed, HRIP has been shown to be expressed in cells 
undergoing proliferation or involved in inflammation (see the specification at, for example, p. 
62). The Examiner must accept these facts to be true unless the Examiner can provide evidence 
or sound scientific reasoning to the contrary. But the Examiner has not done so. 

IV. By requiring the patent applicant to assert a particular or unique utility, the 
Patent Examination Utility Guidelines and Training Materials applied by the 
Patent Examiner misstate the law 

There is an additional, independent reason to overturn the rejections: to the extent the 
rejections are based on Revised Interim Utility Examination Guidelines (64 FR 71427, 
December 21, 1999), the final Utility Examination Guidelines (66 FR 1092, January 5, 2001) 
and/or the Revised Interim Utility Guidelines Training Materials (USPTO Website 
www.uspto.gov, March 1, 2000), the Guidelines and Training Materials are themselves 
inconsistent with the law. 

The Training Materials, which direct the Examiners regarding how to apply the Utility 
Guidelines, address the issue of specificity with reference to two kinds of asserted utilities: 
"specific" utilities which meet the statutory requirements, and "general" utilities which do not. 
The Training Materials define a "specific utility" as follows: 

A [specific utility] is specific to the subject matter claimed. This contrasts to general 
utility that would be applicable to the broad class of invention. For example, a claim to a 
polynucleotide whose use is disclosed simply as "gene probe" or "chromosome marker" 
would not be considered to be specific in the absence of a disclosure of a specific DNA 
target. Similarly, a general statement of diagnostic utility, such as diagnosing an 
unspecified disease, would ordinarily be insufficient absent a disclosure of what condition 
can be diagnosed. 

The Training Materials distinguish between "specific" and "general" utilities by assessing 
whether the asserted utility is sufficiently "particular," i.e., unique (Training Materials at page 
52) as compared to the "broad class of invention." (In this regard, the Training Materials appear 
to parallel the view set forth in Stephen G. Kunin, Written Description Guidelines and Utility 
Guidelines, 82 J.P.T.O.S. 77, 97 (Feb. 2000) ("With regard to the issue of specific utility the 
question to ask is whether or not a utility set forth in the specification is particular to the claimed 
invention.")). 

Such "unique" or "particular" utilities never have been required by the law. To meet the 
utility requirement, the invention need only be "practically useful," Natta, 480 F.2d at 1397, 
and confer a "specific benefit" on the public. Brenner, 383 U.S. at 534. Thus, incredible 
"throwaway" utilities, such as trying to "patent a transgenic mouse by saying it makes great snake 
food," do not meet this standard. Karen Hall, Genomic Warfare, The American Lawyer 68 (June 
2000) (quoting John Doll, Chief of the Biotech Section of USPTO). 

This does not preclude, however, a general utility, contrary to the statement in the 
Training Materials where "specific utility" is defined (page 5). Practical real-world uses are not 
limited to uses that are unique to an invention. The law requires that the practical utility be 
"definite," not particular. Montedison, 664 F.2d at 375. Applicants are not aware of any court 
that has rejected an assertion of utility on the grounds that it is not "particular" or "unique" to the 
specific invention. Where courts have found utility to be too "general," it has been in those cases 
in which the asserted utility in the patent disclosure was not a practical use that conferred a 
specific benefit. That is, a person of ordinary skill in the art would have been left to guess as to 
how to benefit at all from the invention. In Kirk, for example, the CCPA held the assertion that a 
man-made steroid had "useful biological activity" was insufficient where there was no 
information in the specification as to how that biological activity could be practically used. Kirk, 376 
F.2d at 941. 

The fact that an invention can have a particular use does not provide a basis for requiring 
a particular use. See Brana, supra (disclosure describing a claimed antitumor compound as 
being homologous to an antitumor compound having activity against a "particular" type of 
cancer was determined to satisfy the specificity requirement). "Particularity" is not and never has 
been the sine qua non of utility; it is, at most, one of many factors to be considered. 

As described supra, broad classes of inventions can satisfy the utility requirement so long 
as a person of ordinary skill in the art would understand how to achieve a practical benefit from 
knowledge of the class. Only classes that encompass a significant portion of nonuseful members 
would fail to meet the utility requirement. Supra § III.B.2 (Montedison, 664 F.2d at 374-75). 

The Training Materials fail to distinguish between broad classes that convey information 
of practical utility and those that do not, lumping all of them into the latter, unpatentable 
category of "general" utilities. As a result, the Training Materials paint with too broad a brush. 
Rigorously applied, they would render unpatentable whole categories of inventions that 
heretofore have been considered to be patentable and that have indisputably benefitted the public, 
including the claimed invention. See supra § II.B. Thus the Training Materials cannot be 
applied consistently with the law. 

V. To the extent the rejection of the claimed invention under 35 U.S.C. § 112, first 
paragraph, is based on the improper rejection for lack of utility under 35 U.S.C. 
§ 101, it must be reversed. 

The rejection set forth in the Office Action is based on the assertions discussed above, 
i.e., that the claimed invention lacks patentable utility. To the extent that the rejection under 35 
U.S.C. § 112, first paragraph, is based on the improper allegation of lack of patentable utility 
under 35 U.S.C. § 101, it fails for the same reasons. 

VI. Summary 

Applicants respectfully submit that rejections for lack of utility based, inter alia, on an 
allegation of "lack of specificity," as set forth in the Office Action and as justified in the Revised 
Interim and final Utility Guidelines and Training Materials, are not supported in the law. Neither 
are they scientifically correct, nor supported by any evidence or sound scientific reasoning. 
These rejections are alleged to be founded on facts in court cases such as Brenner and Kirk, yet 
those facts are clearly distinguishable from the facts of the instant application, and indeed most if 
not all nucleotide and protein sequence applications. Nevertheless, the PTO is attempting to 
mold the facts and holdings of these prior cases, "like a nose of wax," 3 to target rejections of 
claims to polypeptide and polynucleotide sequences, as well as to claims to methods of detecting 
said polynucleotide sequences, where biological activity information has not been proven by 
laboratory experimentation, and they have done so by ignoring perfectly acceptable utilities fully 
disclosed in the specifications as well as well-established utilities known to those of skill in the 
art. As is disclosed in the specification, and even more clearly, as one of ordinary skill in the art 
would understand, the claimed invention has well-established, specific, substantial and credible 
utilities. The rejections are, therefore, improper and should be reversed. 

Enablement rejection under 35 U.S.C. § 112, 1st paragraph 

Claims 3, 5-6, 8, and 10-11 are rejected under 35 U.S.C. § 112, first paragraph, as 
allegedly containing subject matter which was not described in the specification in such a way as 
to enable one skilled in the art to make and/or use the invention. The Examiner asserts that the 
specification does not support the breadth of the claims with respect to naturally occurring 
variants, biologically active fragments, immunologically active fragments, and polynucleotide 
fragments of at least 60 nucleotides. (Applicants note that claim 11 has been amended and now 
recites fragments of at least 500 nucleotides.) Applicants traverse this rejection for at least the 
reasons below. 

As set forth in In re Marzocchi, 169 USPQ 367, 369 (CCPA 1971): 

The first paragraph of § 112 requires nothing more than objective enablement. 
How such a teaching is set forth, either by the use of illustrative examples or by 
broad terminology, is of no importance. 



3 "The concept of patentable subject matter under § 101 is not 'like a nose of wax which 
may be turned and twisted in any direction * * *.' White v. Dunbar, 119 U.S. 47, 51." (Parker v. 
Flook, 198 USPQ 193 (US SupCt 1978)) 
As a matter of Patent Office practice, then, a specification disclosure which 
contains a teaching of the manner and process of making and using the invention 
in terms which correspond in scope to those used in describing and defining the 
subject matter sought to be patented must be taken as in compliance with the 
enabling requirement of the first paragraph of § 112 unless there is reason to 
doubt the objective truth of the statements contained therein which must be relied 
on for enabling support. 

Applicants submit that the disclosure amply enables the claimed invention. Given the 
sequences of SEQ ID NO:5 and SEQ ID NO:19, one of ordinary skill in the art could readily 
identify a polynucleotide encoding a polypeptide comprising a naturally occurring amino acid 
sequence at least 90% identical to an amino acid sequence of SEQ ID NO:5 or a polynucleotide 
comprising a naturally occurring polynucleotide sequence at least 90% identical to a 
polynucleotide sequence of SEQ ID NO: 19, using well known methods of sequence analysis 
without any undue experimentation. For example, the identification of relevant polynucleotides 
could be performed by hybridization and/or PCR techniques that were well-known to those 
skilled in the art at the time the subject application was filed and/or described throughout the 
Specification of the instant application. See, e.g., p. 42, lines 17-24; and Example VI at p. 52, 
lines 5-21. Thus, one skilled in the art need not make and test vast numbers of polynucleotides. 
Instead, one skilled in the art need only screen a cDNA library or use appropriate PCR conditions 
to identify relevant polynucleotides that already exist in nature. The skilled artisan would also 
know how to use the claimed polynucleotides, for example in expression profiling, disease 
diagnosis, or detection of related sequences as discussed above. 

The specification also describes the expression vectors into which the claimed variants 
and fragments could be inserted, and the construction of fusion proteins (pages 31-32). The 
specification describes binding assays to detect molecular interactions of "HRIP or biologically 
active fragments thereof" on page 56; and immunological methods for detecting and measuring 
HRIP on, for example, page 56. These methods could be used to detect and characterize peptide 
variants and fragments of SEQ ID NO:5. Given this guidance, one of ordinary skill in the art 
would readily understand how to select and screen polynucleotides encoding variants or 
fragments of SEQ ID NO: 5 without any undue experimentation. 

To expedite prosecution, claim 3 has been amended to recite "a biologically active 
fragment comprising at least 150 contiguous amino acids of the amino acid sequence of SEQ ID 
NO:5, wherein said biologically active fragment has sphingosine kinase activity." Applicants are 
amending the claim solely to obtain expeditious allowance of the instant application. Support for 
this amendment to claim 3 can be found in the specification, for example, in Table 2 which 
points out the homology between SEQ ID NO: 5 and mouse sphingosine kinase (g3659694), and 
at p. 54, lines 11-24, which describes assays for measuring kinase activity. By this amendment, 
Applicants expressly do not disclaim equivalents of the invention which could include 
polypeptides or fragments having biological activities in addition to sphingosine kinase inducing 
activity. 

The Examiner suggests that, in order to satisfy the enablement requirement, knowledge of 
the regions of SEQ ID NO:5 that are tolerant to modification, the tolerance of kinases in general 
to modification, a scheme for modifying SEQ ID NO:5 while obtaining the desired biological 
function, and guidance as to which modifications are likely to be successful are required. Applicants wish to 
point out that each of these criteria is directed to the function of the polypeptide, not the 
polynucleotide. Applicants respectfully remind the Examiner that the claims are directed to 
polynucleotides, not polypeptides, and thus it is the functionality of the claimed polynucleotides, 
not the polypeptides encoded by them, that is relevant. 

With respect to the claimed variants of SEQ ID NO: 19, members of this genus may, for 
example, be useful even if they encode proteins that lack sphingosine kinase activity. For 
example, the variant polynucleotides could be used for the detection of sequences related to 
sphingosine kinase (see the specification at p. 42, lines 25-28) including sphingosine kinase 
variants that may be associated with disease states, such as the diseases listed in the specification 
at p. 43, line 2 through p. 44, line 5. See the specification at, for example, p. 44, lines 19-27 for 
disclosure of how to use the claimed sequences in diagnostic assays. The variant polynucleotides 
could also be used in microarrays to identify genetic variants, mutations, and polymorphisms, 
and for disease diagnosis and development and testing of therapeutic agents (see the specification 
at, for example, p. 45, lines 18-28). Thus one of ordinary skill in the art would know how to 
use the claimed polynucleotide variants without having to experimentally determine the 
biological function of the encoded proteins because the function or lack thereof of the protein is 
irrelevant. 

Contrary to the Examiner's assertions, immunogenic fragments of SEQ ID NO:5 are 
amply enabled by the disclosure of the specification. For example, at page, lines, the 
specification describes methods for identifying immunogenic fragments. 

"Alternatively, the HRIP amino acid sequence is analyzed using LASERGENE software 
(DNASTAR) to determine regions of high immunogenicity, and a corresponding 
oligopeptide is synthesized and used to raise antibodies by means known to those of skill 
in the art. Methods for selection of appropriate epitopes, such as those near the 
C-terminus or in hydrophilic regions are well described in the art. (See, e.g., Ausubel, 
1995, supra, ch. 11.)" 
(Specification at page 55, lines 24-28) 

The specification further describes the use of immunogenic fragments to induce antibodies that 
bind specifically to a given region of a protein. 

"When a protein or a fragment of a protein is used to immunize a host animal, numerous 
regions of the protein may induce the production of antibodies which bind specifically to 
antigenic determinants (particular regions or three-dimensional structures on the protein). 
An antigenic determinant may compete with the intact antigen (i.e., the immunogen used 
to elicit the immune response) for binding to an antibody." 
(Specification at page 11, lines 28-32) 

At page 35, lines 5-11, the specification states: 

"For the production of antibodies, various hosts including goats, rabbits, rats, mice, 
humans, and others may be immunized by injection with HRIP or with any fragment or 
oligopeptide thereof which has immunogenic properties. Depending on the host species, 
various adjuvants may be used to increase immunological response. Such adjuvants 
include, but are not limited to, Freund's, mineral gels such as aluminum hydroxide, and 
surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil 
emulsions, KLH, and dinitrophenol. Among adjuvants used in humans, BCG (bacilli 
Calmette-Guerin) and Corynebacterium parvum are especially preferable." 

The specification continues at page 55, lines 29-35 with a description of the immunogenic 
fragments that could be used to induce antibodies: 

"Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 
431A peptide synthesizer (Perkin-Elmer) using fmoc-chemistry and coupled to KLH 
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide 
ester (MBS) to increase immunogenicity. (See, e.g., Ausubel, 1995, 
supra.) Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's 
adjuvant. Resulting antisera are tested for antipeptide and anti-HRIP activity by, for 
example, binding the peptide or HRIP to a substrate, blocking with 1% BSA, reacting 
with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG." 

Immunogenic fragments of SEQ ID NO:5 by definition elicit antibodies that bind to SEQ 
ID NO:5. It is routine to produce antibodies that specifically bind to a protein by immunizing an 
appropriate host with oligopeptide fragments of a protein. It is well known in the art that it is 
possible to produce antibodies to almost any part of an antigen. Given the sequence of SEQ ID 
NO:5, one of skill in the art could readily identify immunogenic fragments of SEQ ID NO:5. 

Contrary to the standard set forth in Marzocchi and Borkowski, the Examiner has failed to 
provide any reasons why one would doubt that the guidance provided by the present 
specification would enable one to make and use the recited polynucleotides. Hence, a prima 
facie case for non-enablement has not been established. For at least the above reasons, 
withdrawal of the enablement rejection under 35 U.S.C. § 112, first paragraph, is respectfully 
requested. 

Written description rejection under 35 U.S.C. § 112, 1st paragraph 

Claims 3, 5-6, 8, and 10-11 have been rejected under the first paragraph of 35 U.S.C. 112 
for alleged lack of an adequate written description. This rejection is respectfully traversed. 
The requirements necessary to fulfill the written description requirement of 35 U.S.C. 112, first 
paragraph, are well established by case law. 

... the applicant must also convey with reasonable clarity to those skilled 
in the art that, as of the filing date sought, he or she was in possession of the 
invention. The invention is, for purposes of the "written description" inquiry, 
whatever is now claimed. Vas-Cath, Inc. v. Mahurkar, 19 USPQ2d 1111, 1117 
(Fed. Cir. 1991) 

Attention is also drawn to the Patent and Trademark Office's own "Guidelines for 
Examination of Patent Applications Under the 35 U.S.C. Sec. 112, para. 1", published January 5, 
2001, which provide that: 

An applicant may also show that an invention is complete by disclosure of 
sufficiently detailed, relevant identifying characteristics which provide evidence 
that applicant was in possession of the claimed invention, i.e., complete or partial 
structure, other physical and/or chemical properties, functional characteristics 
when coupled with a known or disclosed correlation between function and 
structure, or some combination of such characteristics. What is conventional or 
well known to one of ordinary skill in the art need not be disclosed in detail. If a 
skilled artisan would have understood the inventor to be in possession of the 
claimed invention at the time of filing, even if every nuance of the claims is not 
explicitly described in the specification, then the adequate description requirement 
is met. (footnotes omitted.) 

Thus, the written description standard is fulfilled by both what is specifically disclosed 
and what is conventional or well known to one skilled in the art. 

SEQ ID NO:5 and SEQ ID NO:19 (the polynucleotide sequence encoding SEQ ID 
NO:5) are specifically disclosed in the application (see, for example, page 6, lines 14-30). 
Variants of SEQ ID NO:19 are described, for example, at page 24, lines 8-16. In particular, the 
variants of SEQ ID NO:19 are described in the alternative as having at least 80%, at least 90%, or at 
least 95% sequence identity to a polynucleotide sequence having SEQ ID NO:19 at, for example, 
page 24, lines 11-16. Incyte clones in which the nucleic acids encoding the human sphingosine 
kinase were first identified and libraries from which those clones were isolated are described, for 
example, at pages 64-65 of the specification. Chemical and structural features of the protein 
encoded by SEQ ID NO:19 are described, for example, on page 59, row 6. Given SEQ ID 
NO:19, one of ordinary skill in the art would recognize naturally-occurring variants of SEQ ID 
NO:19 having 90% sequence identity to SEQ ID NO:19. Accordingly, the Specification 
provides an adequate written description of the recited polypeptide sequences. 

The Office Action has further asserted that the claims are not supported by an adequate 
written description because "many functionally unrelated DNAs are encompassed within the 
scope of these claims" (page 12 of the Office Action of June 4, 2003). Such a position is 
believed to present a misapplication of the law. 

1. The present claims specifically define the claimed genus through the 
recitation of chemical structure 

Court cases in which "DNA claims" have been at issue commonly emphasize that the 
recitation of structural features or chemical or physical properties is an important factor to 
consider in a written description analysis of such claims. For example, in Fiers v. Revel, 25 
USPQ2d 1601, 1606 (Fed. Cir. 1993), the court stated that: 

If a conception of a DNA requires a precise definition, such as by structure, 
formula, chemical name or physical properties, as we have held, then a description 
also requires that degree of specificity. 

In a number of instances in which claims to DNA have been found invalid, the courts 
have noted that the claims attempted to define the claimed DNA in terms of functional 
characteristics without any reference to structural features. As set forth by the court in University 
of California v. Eli Lilly and Co., 43 USPQ2d 1398, 1406 (Fed. Cir. 1997): 

In claims to genetic material, however, a generic statement such as "vertebrate 
insulin cDNA" or "mammalian insulin cDNA," without more, is not an adequate 
written description of the genus because it does not distinguish the claimed genus 
from others, except by function. 

Thus, the mere recitation of functional characteristics of a DNA, without the definition of 
structural features, has been a common basis by which courts have found invalid claims to DNA. 
For example, in Lilly, 43 USPQ2d at 1407, the court found invalid for violation of the written 
description requirement the following claim of U.S. Patent No. 4,652,525: 

1. A recombinant plasmid replicable in procaryotic host containing within its 
nucleotide sequence a subsequence having the structure of the reverse transcript of 
an mRNA of a vertebrate, which mRNA encodes insulin. 

In Fiers, 25 USPQ2d at 1603, the parties were in an interference involving the following 
count: 

A DNA which consists essentially of a DNA which codes for a human fibroblast 
interferon-beta polypeptide. 

Party Revel in the Fiers case argued that its foreign priority application contained an 
adequate written description of the DNA of the count because that application mentioned a 
potential method for isolating the DNA. The Revel priority application, however, did not have a 
description of any particular DNA structure corresponding to the DNA of the count. The court 
therefore found that the Revel priority application lacked an adequate written description of the 
subject matter of the count. 

Thus, in Lilly and Fiers, nucleic acids were defined on the basis of functional 
characteristics and were found not to comply with the written description requirement of 35 
U.S.C. § 112; i.e., "an mRNA of a vertebrate, which mRNA encodes insulin" in Lilly, and "DNA 
which codes for a human fibroblast interferon-beta polypeptide" in Fiers. In contrast to the 
situation in Lilly and Fiers, the claims at issue in the present application define polynucleotides 
in terms of chemical structure, rather than functional characteristics. For example, the "variant 
language" of independent claim 10, as presently amended, recites chemical structure to define the 
claimed genus: 

10. An isolated polynucleotide comprising a polynucleotide sequence selected 
from the group consisting of:...b) a naturally occurring polynucleotide sequence 
having at least 90% sequence identity to a polynucleotide sequence of SEQ ID 
NO:19, 

From the above it should be apparent that the claims of the subject application are 
fundamentally different from those found invalid in Lilly and Fiers. The subject matter of the 
present claims is defined in terms of the chemical structure of SEQ ID NO: 19. In the present 
case, there is no reliance merely on a description of functional characteristics of the 
polynucleotides recited by the claims. In fact, there is no recitation of functional characteristics. 
Moreover, if such functional recitations were included, they would add to the structural 
characterization of the recited polynucleotides. The polynucleotides defined in the claims of the 
present application recite structural features, and cases such as Lilly and Fiers stress that the 
recitation of structure is an important factor to consider in a written description analysis of claims 
of this type. By failing to base its written description inquiry "on whatever is now claimed," the 
Office Action failed to provide an appropriate analysis of the present claims and how they differ 
from those found not to satisfy the written description requirement in Lilly and Fiers. 

2. The present claims do not define a genus which is "highly variant" 

Furthermore, the claims at issue do not describe a genus which could be characterized as 
"a large variable genus." Available evidence illustrates that the claimed genus is of narrow 
scope. 

In support of this assertion, the Examiner's attention is directed to the enclosed reference 
by Brenner et al. ("Assessing sequence comparison methods with reliable structurally identified 
distant evolutionary relationships," Proc. Natl. Acad. Sci. USA (1998) 95:6073-6078). Through 
exhaustive analysis of a data set of proteins with known structural and functional relationships 
and with <90% overall sequence identity, Brenner et al. have determined that 30% identity is a 
reliable threshold for establishing evolutionary homology between two sequences aligned over at 
least 150 residues. (Brenner et al., pages 6073 and 6076.) Furthermore, local identity is 
particularly important in this case for assessing the significance of the alignments, as Brenner et 
al. further report that ≥40% identity over at least 70 residues is reliable in signifying homology 
between proteins. (Brenner et al., page 6076.) 

The present application is directed, inter alia, to kinase proteins related to the amino acid 
sequence of SEQ ID NO:5. In accordance with Brenner et al., naturally occurring molecules may 
exist which could be characterized as kinase proteins and which have as little as 40% identity 
over at least 70 residues to SEQ ID NO:5. The "variant language" of the present claims recites, 
for example, polynucleotides encoding "a naturally-occurring amino acid sequence having at 
least 90% sequence identity to the amino acid sequence of SEQ ID NO:5" (note that SEQ ID 
NO:5 has 384 amino acid residues). This variation is far less than that of all potential kinase 
proteins related to SEQ ID NO:5, i.e., those kinase proteins having as little as 40% identity over 
at least 70 residues to SEQ ID NO:5. 
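
The comparison of thresholds above can be made concrete with a short calculation. The following is a minimal sketch, assuming that "percent sequence identity" means identical positions divided by aligned positions; the specification may define the term differently, and the function names are illustrative only. The 384-residue length of SEQ ID NO:5, the 90% claim threshold, and the 40%-over-70-residues figure are taken from the text above.

```python
import math


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns (assumed definition).

    Columns in which both sequences carry a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def max_mismatches(length: int, min_identity_pct: float) -> int:
    """Largest number of non-identical positions that still meets the threshold."""
    min_matches = math.ceil(length * min_identity_pct / 100.0)
    return length - min_matches


SEQ5_LENGTH = 384  # residues in SEQ ID NO:5, as stated in the text

# At least 90% identity over the full length allows at most 38 differences.
print(max_mismatches(SEQ5_LENGTH, 90.0))  # 38

# 40% identity over a 70-residue window allows up to 42 differences
# within that window alone.
print(max_mismatches(70, 40.0))  # 42
```

Under these assumptions, a full-length variant within the claimed genus may differ from SEQ ID NO:5 at no more than 38 of 384 positions, whereas a sequence meeting only the 40%-over-70-residues criterion may differ at 42 positions within a single 70-residue window.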

3. The state of the art at the time of the present invention is further advanced 
than at the time of the Lilly and Fiers applications 

In the Lilly case, claims of U.S. Patent No. 4,652,525 were found invalid for failing to 
comply with the written description requirement of 35 U.S.C. §112. The '525 patent claimed the 
benefit of priority of two applications, Application Serial No. 801,343 filed May 27, 1977, and 
Application Serial No. 805,023 filed June 9, 1977. In the Fiers case, party Revel claimed the 
benefit of priority of an Israeli application filed on November 21, 1979. Thus, the written 
description inquiry in those cases was based on the state of the art at essentially the "dark ages" 
of recombinant DNA technology. 

The present application has a priority date of March 18, 1999. Much has happened in the 
development of recombinant DNA technology in the 24 or more years between the time of filing of 
the applications involved in Lilly and Fiers and the present application. For example, the 
technique of polymerase chain reaction (PCR) was invented. Highly efficient cloning and DNA 
sequencing technology has been developed. Large databases of protein and nucleotide sequences 
have been compiled. Much of the raw material of the human and other genomes has been 
sequenced. With these remarkable advances one of skill in the art would recognize that, given 
the sequence information of SEQ ID NO:5 and SEQ ID NO:19, and the additional extensive 
detail provided by the subject application, the present inventors were in possession of the 
claimed polynucleotide variants at the time of filing of this application. 

4. Summary 

The Office Action failed to base its written description inquiry "on whatever is now 
claimed." Consequently, the Action did not provide an appropriate analysis of the present claims 
and how they differ from those found not to satisfy the written description requirement in cases 
such as Lilly and Fiers. In particular, the claims of the subject application are fundamentally 
different from those found invalid in Lilly and Fiers. The subject matter of the present claims is 
defined in terms of the chemical structure of SEQ ID NO:5 or SEQ ID NO:19. The courts have 
stressed that structural features are important factors to consider in a written description analysis 
of claims to nucleic acids and proteins. In addition, the genus of polynucleotides defined by the 
present claims is adequately described, as evidenced by Brenner et al. and consideration of the 
claims of the '740 patent involved in Lilly. Furthermore, there have been remarkable advances in 
the state of the art since the Lilly and Fiers cases, and these advances were given no 
consideration whatsoever in the position set forth by the Office Action. Applicants respectfully 
request that this rejection be withdrawn. 

Rejections under 35 U.S.C. § 102(b) 

Claims 3 and 11 are rejected under 35 U.S.C. § 102(b) as allegedly being anticipated by 
Genbank accession # AA639414. These claims, as presently amended, recite fragments 
comprising at least 150 contiguous amino acid residues or at least 500 contiguous nucleotides. 
This rejection has therefore been rendered moot. Withdrawal of this rejection is respectfully 
requested. 

Claims 3, 5-6, 8, and 10-11 are rejected under 35 U.S.C. §102(b) as allegedly being 
anticipated by Kohama et al. Applicants respectfully point out that Kohama et al. (JBC 
273:23722-8, 1998) was published on September 11, 1998, less than one year prior to 
Applicants' priority date of March 18, 1999. Therefore this reference does not qualify as prior 
art under 35 U.S.C. §102(b). Applicants respectfully request that this rejection be withdrawn. 

Rejections under 35 U.S.C. § 102(a) 

Claims 3 and 11 are rejected under 35 U.S.C. § 102(a) as allegedly being anticipated by 
Genbank accession # AI042283, published September 24, 1998. This reference, however, was 
published after Applicants' date of invention. Attached is Reference No. 11, which demonstrates 
the date of invention of SEQ ID NO:5 as at least as early as December 13, 1996. Reference No. 11 is 
a print-out of the clone information for SEQ ID NO:19 (Project ID 2415617), showing an "initial 
entry date" of December 13, 1996. This date antedates the Genbank reference, thus removing it as 
prior art. Therefore, the Examiner has failed to make out a prima facie case because she has not 
cited a single prior art reference that discloses all of the elements and limitations of Applicants' 
present claims. Applicants respectfully request that the rejection be withdrawn. 

Claims 3, 5-6, and 8 are rejected under 35 U.S.C. § 102(a) as allegedly being anticipated 
by Young et al. (WO98/54963). The priority filing date of this reference is June 6, 1997, which is 
after Applicants' date of invention of SEQ ID NO:5 of December 13, 1996, as established above. 
Applicants' date of invention antedates the Young et al. reference, thus eliminating it as prior 
art. Therefore, the Examiner has again failed to make out a prima facie case because she has not 
cited a single prior art reference that discloses all of the elements and limitations of Applicants' 
present claims. Applicants respectfully request that the rejection be withdrawn. 
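
The date comparisons underlying the § 102(b) and § 102(a) arguments above can be checked mechanically. The following is a simplified sketch only: it assumes that the relevant comparison dates are exactly those recited in this response, and it does not address which date legally controls. The function and constant names are illustrative.

```python
from datetime import date


def one_year_before(d: date) -> date:
    """Return the calendar date one year before d."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        # d is February 29 and the previous year is not a leap year
        return d.replace(year=d.year - 1, day=28)


def is_more_than_one_year_before(publication: date, filing: date) -> bool:
    """True if the publication precedes the filing date by more than one year."""
    return publication < one_year_before(filing)


def is_antedated(reference_date: date, invention_date: date) -> bool:
    """True if the asserted date of invention is earlier than the reference date."""
    return invention_date < reference_date


# Dates as recited in this response
PRIORITY_DATE = date(1999, 3, 18)
INVENTION_DATE = date(1996, 12, 13)
KOHAMA_PUBLISHED = date(1998, 9, 11)
AI042283_PUBLISHED = date(1998, 9, 24)
YOUNG_PRIORITY = date(1997, 6, 6)

print(is_more_than_one_year_before(KOHAMA_PUBLISHED, PRIORITY_DATE))  # False
print(is_antedated(AI042283_PUBLISHED, INVENTION_DATE))               # True
print(is_antedated(YOUNG_PRIORITY, INVENTION_DATE))                   # True
```

On the recited dates, Kohama et al. falls inside the one-year window, and both the Genbank AI042283 entry and the Young et al. priority date are later than the asserted date of invention.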

Rejection under 35 U.S.C. § 103(a) 

Claim 10 is rejected under 35 U.S.C. § 103(a) as allegedly being obvious over Kohama et 
al. in view of Genbank accession numbers D31133, AA232791, W63556, AA081152, and 
AA026479. However, the Examiner has not met her burden of making a prima facie case of 
obviousness for at least the reasons given below. 

As stated previously, the Kohama et al. reference does not qualify as prior art under 35 
U.S.C. § 102(b) because its date of publication is less than one year prior to the filing date of the 
present application. Furthermore, this reference does not qualify as prior art under 35 U.S.C. 
§ 102(a) because Applicants' date of invention of December 13, 1996 antedates the Kohama et al. 
reference. Moreover, even assuming that all of the ESTs combined in the Kohama et al. 
reference to arrive at the putative human sequence contained therein could be asserted as prior 
art, there is no motivation to combine absent knowledge of Applicants' invention or the mouse 
sphingosine kinase sequence. This is impermissible hindsight reconstruction. Without the 
motivation to combine the Examiner has not met the burden of establishing a prima facie case of 
obviousness. Withdrawal of this rejection is therefore respectfully requested. 

CONCLUSION 

In light of the above amendments and remarks, Applicants submit that the present 
application is fully in condition for allowance, and request that the Examiner withdraw the 
outstanding objections/rejections. Early notice to that effect is earnestly solicited. 

If the Examiner contemplates other action, or if a telephone conference would expedite 
allowance of the claims, Applicants invite the Examiner to contact the undersigned at the number 
listed below. 



Please charge Deposit Account No. 09-0108 in the amount of $1130.00 as set forth in 
the enclosed fee transmittal letter. If the USPTO determines that an additional fee is necessary, 
please charge any required fee to Deposit Account No. 09-0108. 



Respectfully submitted, 
INCYTE CORPORATION 

Cathleen M. Rocco 
Direct Dial Telephone: (650) 845-4587 



Date: [illegible] 

Karin M. Gerstin 
Reg. No. 54,119 

Direct Dial Telephone: (650) 845-4889 

Customer No.: 27904 
3160 Porter Drive 
Phone: (650) 855-0555 
Fax: (650) 849-8886 

Abstract Sphingosine kinase catalyzes the phosphorylation of 
sphingosine to form sphingosine 1-phosphate (SPP), a novel lipid 
mediator with both intra- and extracellular functions. Based on 
sequence identity to murine sphingosine kinase (mSPHK1a), we 
cloned and characterized the first human sphingosine kinase 
(hSPHK1). The open reading frame of hSPHK1 encodes a 384 
amino acid protein with 85% identity and 92% similarity to 
mSPHK1a at the amino acid level. Similar to mSPHK1a, when 
HEK293 cells were transfected with hSPHK1, there were 
marked increases in sphingosine kinase activity resulting in 
elevated SPP levels. hSPHK1 also specifically phosphorylated 
D-erythro-sphingosine and to a lesser extent sphinganine, but not 
other lipids, such as D,L-threo-dihydrosphingosine, 
N,N-dimethylsphingosine, diacylglycerol, ceramide, or phosphatidylinositol. 
Northern analysis revealed that hSPHK1 was widely expressed 
with highest levels in adult liver, kidney, heart and skeletal 
muscle. Thus, hSPHK1 belongs to a highly conserved unique 
lipid kinase family that regulates diverse biological functions. 
© 2000 Federation of European Biochemical Societies. 

Key words: Human sphingosine kinase; 
Sphingosine 1-phosphate 



1. Introduction 

The metabolic product of sphingosine kinase (SPHK), 
sphingosine 1-phosphate (SPP), is a lipid signaling molecule 
that acts both intra- and extracellularly to affect many bio- 
logical processes. These include mitogenesis [1,2], apoptosis 
[3], atherosclerosis [4] and inflammatory responses [5,6]. Spe- 
cific members of the EDG-1 family of G protein-coupled re- 
ceptors bind SPP (reviewed in [7,8]) and modulate chemotaxis 
[9,10], angiogenesis [10-12], neurite retraction and cell round- 
ing [13]. Because SPP levels are mainly regulated by the ac- 
tivity of SPHK, cloning and characterization of this enzyme 
are important for understanding its role in normal and pathological 
processes. Previously, we purified SPHK to homogeneity 
from rat kidneys [14] and subsequently identified mouse 
cDNAs encoding two forms of SPHK, designated mSPHK1a 
and mSPHK1b, whose predicted proteins differ by only 10 
amino acids at their N-terminus [15]. The corresponding 
mRNAs may arise by alternative splicing. In this study, sequence 
homologies to the mSPHK1a cDNAs were used to 
identify and clone the first human homologue, hSPHK1. 
hSPHK1 is ubiquitously expressed in adult tissues with highest 
levels in liver, kidney, lung and skeletal muscle. Our results 
suggest that hSPHK1 belongs to a family of highly conserved 
enzymes which differ from other known lipid kinases. 

2. Materials and methods 

2.1. Materials 

SPP, sphingosine, and N,N-dimethylsphingosine (DMS) were from 
Biomol Research Laboratory Inc. (Plymouth Meeting, PA). All other 
lipids were purchased from Avanti Polar Lipids (Birmingham, AL). 
[γ-32P]ATP (3000 Ci/mmol) was purchased from Amersham (Arlington 
Heights, IL). Poly-L-lysine was from Boehringer Mannheim (Indianapolis, 
IN). Alkaline phosphatase from bovine intestinal mucosa, 
type VII-NT, was from Sigma (St. Louis, MO). Restriction enzymes 
were from New England Biolabs (Beverly, MA). Lipofectamine Plus 
was from Life Technologies (Gaithersburg, MD). 

2.2. Human sphingosine kinase cDNA cloning 

BLAST searches using mSPHK1a sequences identified an EST 
clone (AA026479) which contained sequences homologous to several 
conserved domains of mSPHK [15]. To obtain a full-length cDNA, 
the 5'-end of hSPHK1 was extended by rapid amplification of cDNA 
ends/polymerase chain reaction (RACE-PCR; Life Technologies). 
First, cDNA was synthesized from HEK293 poly(A)+ RNA with a 
gene-specific antisense primer hspk1-GSP1 (5'-ACCATTGTCCAGTGAG). 
Then two consecutive PCR reactions using LA Taq (TaKaRa) 
were performed. First PCR: 5' RACE Abridged Anchor Primer and 
the antisense primer hspk1-GSP2 (5'-TTCCTACAGGGAGGTAGGCC) 
at 94°C for 2 min followed by 30 cycles of amplification 
(94°C for 1 min, 55°C for 1 min, 72°C for 2 min) and primer extension 
at 72°C for 5 min. Second PCR: Abridged Universal Amplification 
Primer and the antisense primer hspk1-GSP3 (5'-GGCTGCCAGACGCAGGAAGG) 
using a program similar to the first PCR but 
with annealing at 65°C. The PCR products were cloned into pCR 2.1 
(TA Cloning, Invitrogen) and sequences confirmed by automated sequencing. 
To make expression constructs, a primer set was designed 
as follows: sense primer containing a Kozak sequence and ATG start 
codon, sphk1-GSP4 (5'-GCCACCATGGATCCAGCGGGCGGCCCC); 
antisense primer, sphk1-GSP5 (5'-TCATAAGGGCTCTTCTGGCGGTGGCATCTG). 
The PCR reaction was performed using human 
fetus Marathon-Ready cDNA (Clontech) as template with the 
above primers, and the amplification product was subcloned into 
pCR3.1 (Eukaryotic TA Cloning, Invitrogen). In addition, hSPHK1 
was tagged at the N-terminus by subcloning into a pcDNA-c-myc 
vector [2] using high fidelity taq polymerase (Pfu, Stratagene). 
hSPHK1 accession number is AF238083. 

2.3. Cell culture and expression of sphingosine kinase 

Human embryonic kidney cells (HEK293, ATCC CRL-1573) were 
grown in high glucose Dulbecco's modified Eagle's medium (DMEM) 
containing 100 U/ml penicillin, 100 µg/ml streptomycin and 2 mM 
L-glutamine supplemented with 10% fetal bovine serum [15]. Cells 
were transfected with either pcDNA3.1 or pCR3.1 containing 
hSPHK1 using Lipofectamine Plus according to the manufacturer's 
protocol. Transfection efficiencies were typically about 40%. 

2.4. Measurement of sphingosine kinase activity 

Cytosolic sphingosine kinase activity was determined with 50 µM 
sphingosine, dissolved in 5% Triton X-100 (final concentration 
0.25%), and [γ-32P]ATP (10 µCi, 1 mM) containing MgCl2 (10 mM) 
as previously described [15]. In some experiments, sphingosine was 
added as a complex with bovine serum albumin (BSA) as previously 
described [15]. Specific activity is expressed as pmol SPP formed per 
min per mg protein. 
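
The unit of specific activity defined above (pmol SPP formed per min per mg protein) can be illustrated with a short calculation. This is a minimal sketch; the assay readings in it are hypothetical values chosen for easy arithmetic and are not data from the paper, and the function names are illustrative.

```python
def specific_activity(pmol_product: float, minutes: float, mg_protein: float) -> float:
    """Return specific activity in pmol product per min per mg protein."""
    if minutes <= 0 or mg_protein <= 0:
        raise ValueError("incubation time and protein amount must be positive")
    return pmol_product / (minutes * mg_protein)


def fold_change(sample: float, control: float) -> float:
    """Return the ratio of a sample activity to a control activity."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return sample / control


# Hypothetical readings (not from the paper): 20 min incubation, 0.5 mg protein
control = specific_activity(pmol_product=500.0, minutes=20.0, mg_protein=0.5)
sample = specific_activity(pmol_product=5000.0, minutes=20.0, mg_protein=0.5)

print(control)                       # 50.0 pmol/min/mg
print(sample)                        # 500.0 pmol/min/mg
print(fold_change(sample, control))  # 10.0
```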

2.5. Lipid extraction and measurement of SPP, sphingosine, and 
ceramide 

Cells were washed with phosphate buffered saline and scraped in 
1 ml of methanol containing 2.5 µl concentrated HCl. Lipids were 
extracted by adding 2 ml chloroform/1 M NaCl (1:1, v/v) and 100 µl 
3 N NaOH and phases were separated. The basic aqueous phase 
containing SPP, and devoid of sphingosine, ceramide, and the majority 
of phospholipids, was transferred to a siliconized glass tube. The 
organic phases were re-extracted with 1 ml methanol/1 M NaCl (1:1, 
v/v) plus 50 µl 3 N NaOH, and the aqueous fractions combined. Mass 
measurements of SPP in the aqueous phase were carried out as previously 
described [16]. Sphingosine and ceramide in the organic phase 
were determined by enzymatic methods using sphingosine kinase and 
diacylglycerol kinase, respectively [17]. Total phospholipids present in 
lipid extracts were also quantified [17]. 



V.E. Nava et al./FEBS Letters 473 (2000) 81-84 

2.6. Northern blotting analysis 

Poly(A)+ RNA blots containing 2 µg of poly(A)+ RNA per lane 
from multiple adult human tissues (Clontech) were hybridized with 
the 0.6 kb EcoRV/SphI fragment of pCR3.1-hSPHK1, which was gel-purified 
and labeled with [32P]dCTP by random priming. Hybridization 
in ExpressHyb buffer (Clontech) was carried out at 65°C overnight 
according to the manufacturer's protocol. Blots were reprobed 
with a human β-actin control probe (Clontech). Bands were quantified 
using a Molecular Dynamics Phosphoimager. 

3. Results and discussion 

3.1. Cloning of hSPHK1 

BLAST searches of the EST database identified a human 
homologue of murine SPHK, EST AA026479, with similarity 
to the 3' end of mSPHK1a. This sequence was used to design 
specific primers and 5' RACE was performed on mRNA extracted 
from HEK293 cells to obtain the full-length cDNA of 
hSPHK1. The open reading frame encodes a protein with 384 
amino acids, and 85% identity and 92% similarity to 
mSPHK1a at the amino acid level (Fig. 1). We previously 
found by sequence alignment that SPHKs from mouse, yeast 
and Caenorhabditis elegans share several conserved blocks of 
amino acids [15]. Similarly, hSPHK1 contains these conserved 
regions (C1-C5, Fig. 1), including the invariant positively 
charged motif, GGKGK, in the C1 domain, which may be 
part of the ATP binding site of this novel class of lipid kinases. 

Fig. 1. Predicted amino acid sequence of hSPHK1 and alignment of the conserved domains. ClustalW alignment of SPHKs from mouse and 
human. Identical and conserved amino acid substitutions are shaded dark and light gray, respectively. The conserved domains (C1-C5) are indicated 
by lines and the invariant positively charged motif GGKGK by asterisks. 

Fig. 2. Activity of hSPHK1 expressed in HEK293 cells. A: HEK293 cells were transiently transfected with empty vector or vector containing 
either mSPHK1a or hSPHK1. SPHK activity was measured in cytosol (filled bars) and particulate pellet (open bars) 24 h after transfection using 
sphingosine-BSA complexes or sphingosine-Triton X-100 micelles as substrate as indicated. SPHK activity in vector transfected cells was 
84 ± 2 and 134 ± 27 pmol/min/mg using sphingosine-BSA complexes or sphingosine-Triton X-100 micelles as substrate, respectively. Data are 
means ± S.D. and are representative of two independent experiments performed in triplicate. B: Changes in mass levels of SPP, sphingosine, 
and ceramide. Mass levels of SPP, sphingosine and ceramide in cells transfected with empty vector (open bars) or vector containing hSPHK1 
(filled bars) were measured after 24 h. Data are expressed as pmol/nmol phospholipid and are means ± S.D. of triplicate determinations. 



3.2. hSPHK1 encodes a functional sphingosine kinase 

HEK293 cells were transfected with expression vectors containing 
hSPHK1 to determine whether it encodes a bona fide 
SPHK. Modest levels of endogenous SPHK activity were detected 
in cells transfected with an empty vector (Fig. 2A). 
Twenty-four hours after transfection with pcDNA3.1-hSPHK1, 
the SPHK activity increased approximately 600-fold 
and remained at this level for at least 2 days. For comparison, 
a similar increase in activity was observed after transfection 
with mSPHK1a (Fig. 2A). Similar results were obtained 
when cells were transfected with hSPHK1 in pCR3.1. 
In agreement with previous results with mSPHK1a [15], 
hSPHK1 was stimulated by Triton X-100. 

Fig. 3. A: Substrate specificity of hSPHK1. HEK293 cells were 
transfected with hSPHK1 and SPHK-dependent phosphorylation of 
various sphingosine analogs or other lipids (50 µM) was measured 
using cell lysates as enzyme source. DAG, diacylglycerol; PI, phosphatidylinositol; 
C2-CER, N-acetyl-sphingosine. B: DMS and DHS 
are inhibitors of hSPHK1. SPHK activity in HEK293 cell lysates 24 
h after transfection with hSPHK1 was measured with 10 µM SPP 
in the absence or presence of 20 µM and 40 µM DMS or DHS. 
Data are means ± S.D. of triplicate determinations and are expressed 
as percent inhibition. 

Both membrane-associated and cytosolic SPHK activity have been described in 
mammalian tissues and cell lines [1,18-21]. In cells transfected 
with hSPHK1, approximately 70% of the SPHK activity was 
found in the cytosol and only about 30% was membrane-associated 
(Fig. 2A). Similarly, we previously found that the 
majority of mSPHK1a activity was also expressed in the cytosol 
[2,15]. Kyte-Doolittle hydropathy plots did not suggest 
the presence of any potential hydrophobic membrane spanning 
domains in the primary structure of hSPHK1. 

Transfection of HEK293 cells with hSPHK1 also resulted in 
changes in levels of sphingolipid metabolites (Fig. 2B). Mass 
levels of SPP increased 5.7-fold compared to cells transfected 
with vector alone, with an 18% decrease in levels of both sphingosine 
and ceramide. However, because intracellular ceramide 
pools are much larger than sphingosine pools, the absolute 
decrease of ceramide was greater than the decrease in sphingosine 
mass. These results suggest that transfected hSPHK1 is 
active in intact cells, and that kinase overexpression can alter 
the intracellular balance of sphingolipid metabolites. 
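
The point that an equal percentage decrease produces a larger absolute decrease in the larger pool can be shown with a worked example. This is a minimal sketch: the 18% figure comes from the text above, but the baseline pool sizes are hypothetical placeholders, since the measured values appear only in Fig. 2B.

```python
def absolute_decrease(baseline: float, percent_decrease: float) -> float:
    """Return the absolute drop corresponding to a percentage decrease."""
    if baseline < 0 or not 0 <= percent_decrease <= 100:
        raise ValueError("baseline must be >= 0 and percent within 0-100")
    return baseline * percent_decrease / 100.0


PERCENT_DECREASE = 18.0  # reported for both sphingosine and ceramide

# Hypothetical baseline pools in pmol/nmol phospholipid (not from the paper)
HYPOTHETICAL_SPHINGOSINE = 10.0
HYPOTHETICAL_CERAMIDE = 100.0

sphingosine_drop = absolute_decrease(HYPOTHETICAL_SPHINGOSINE, PERCENT_DECREASE)
ceramide_drop = absolute_decrease(HYPOTHETICAL_CERAMIDE, PERCENT_DECREASE)

print(round(sphingosine_drop, 2))  # 1.8
print(round(ceramide_drop, 2))     # 18.0
print(ceramide_drop > sphingosine_drop)  # True
```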

3.3. Substrate specificity of hSPHK1 

The naturally occurring D-(+)-erythro-trans-isomer of sphingosine 
and erythro-dihydrosphingosine (sphinganine) were the 
best substrates for hSPHK1 (Fig. 3A). However, similar to the 
specificity of mSPHK1a [15], sphingosine was more efficiently 
phosphorylated than sphinganine. 

Fig. 4. Tissue-specific expression of hSPHK1 by Northern blot analysis. 
Top panel: A hSPHK1 probe was hybridized to a poly(A)+ 
RNA blot with the human tissues indicated at the top of each lane 
as described in Section 2. Bottom panel: A β-actin probe was used 
to reprobe the blot. 

Moreover, other sphingolipids, 
including D,L-threo-dihydrosphingosine (DHS) and C2-ceramide, 
as well as diacylglycerol and phosphatidylinositol 
were not substrates (Fig. 3A). With D-erythro-sphingosine as 
substrate, half-maximal velocity was found at 5 µM, in excellent 
agreement with Km values previously determined with rat 
kidney SPHK [14] and recombinant mSPHK1a [15]. DMS 
and DHS have previously been used to inhibit SPHK and 
block increases in SPP induced by various physiological stimuli 
[1,3,22]. Both of these sphingolipids also inhibited 
hSPHK1 and similar to their inhibitory effects on mSPHK1a 
[15], DHS was slightly more potent than DMS (Fig. 3B). 
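
The half-maximal velocity reported in Section 3.3 can be related to the assay conditions with a short calculation. This is a sketch that assumes simple Michaelis-Menten kinetics, v = Vmax·[S]/(Km + [S]); the paper does not state that the enzyme follows this model under all conditions. The 5 µM half-maximal concentration and the 50 µM assay concentration are taken from the text; the function name is illustrative.

```python
def fraction_of_vmax(substrate_um: float, km_um: float) -> float:
    """Return v/Vmax under the Michaelis-Menten model (assumed here)."""
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_um / (km_um + substrate_um)


KM_UM = 5.0      # sphingosine concentration giving half-maximal velocity
ASSAY_UM = 50.0  # sphingosine concentration used in the activity assay

print(fraction_of_vmax(KM_UM, KM_UM))               # 0.5
print(round(fraction_of_vmax(ASSAY_UM, KM_UM), 3))  # 0.909
```

Under this assumption, the 50 µM sphingosine used in the standard assay would correspond to roughly 91% of maximal velocity.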

3.4. Tissue distribution of hSPHK1 expression 

The tissue distribution of SPHK1 mRNA expression in 
adult human tissues was analyzed by Northern blotting 
(Fig. 4). In most tissues, including adult brain, heart, spleen, 
lung, kidney, and testis, a predominant 1.9 kb mRNA species 
was detected. Expression was highest in adult liver, heart and 
skeletal muscle. In comparison, we previously showed that 
mSPHK1a expression is greatest in mouse spleen, lung, kid- 
ney, testis and heart, with much lower expression in skeletal 
muscle [15]. 

In summary, hSPHK1 is the human homolog of mSPHK1. 
Based on EST sequences, hSPHK1 has been localized on 
chromosome 17q25.2 at the marker stSG28540 (D17S785- 
D17S836 Reference Interval, UniGene cluster Hs.68061, 
URL: http://www.ncbi.nlm.nih.gov/unigene/clust.cgi?org=hs&cid=68061). 
hSPHK1 belongs to a conserved family of 
genes that is distinct from other known lipid kinases. Molec- 
ular cloning and characterization of members of the SPHK 
family should help to clarify their potential roles in various 
human diseases as their product, SPP, has been implicated as 
an important regulatory component of biological processes 
including growth, survival, allergy, chemotaxis, and angiogen- 
esis. 
The availability of genome-scale DNA sequence information and reagents has radically altered life-science 
research. This revolution has led to the development of a new scientific subdiscipline derived from a combina- 
tion of the fields of toxicology and genomics. This subdiscipline, termed toxicogenomics, is concerned with the 
identification of potential human and environmental toxicants, and their putative mechanisms of action, through 
the use of genomics resources. One such resource is DNA microarrays or "chips," which allow the monitoring of 
the expression levels of thousands of genes simultaneously. Here we propose a general method by which gene 
expression, as measured by cDNA microarrays, can be used as a highly sensitive and informative marker for 
toxicity. Our purpose is to acquaint the reader with the development and current state of microarray technol- 
ogy and to present our view of the usefulness of microarrays to the field of toxicology. 
Mol. Carcinog. 24:153-159, 1999. © 1999 Wiley-Liss, Inc. 
INTRODUCTION 

Technological advancements combined with in- 
tensive DNA sequencing efforts have generated an 
enormous database of sequence information over the 
past decade. To date, more than 3 million sequences, 
totaling over 2.2 billion bases [1], are contained 
within the GenBank database, which includes the 
complete sequences of 19 different organisms [2]. The 
first complete sequence of a free-living organism, 
Haemophilus influenzae, was reported in 1995 [3] and 
was followed shortly thereafter by the first complete 
sequence of a eukaryote, Saccharomyces cerevisiae [4]. 
The development of dramatically improved sequenc- 
ing methodologies promises that complete elucida- 
tion of the Homo sapiens DNA sequence is not far 
behind [5]. 

To exploit more fully the wealth of new sequence 
information, it was necessary to develop novel meth- 
ods for the high-throughput or parallel monitoring 
of gene expression. Established methods such as 
northern blotting, RNase protection assays, S1 nu- 
clease analysis, plaque hybridization, and slot blots 
do not provide sufficient throughput to effectively 
utilize the new genomics resources. Newer methods 
such as differential display [6], high-density filter 
hybridization [7,8], serial analysis of gene expression 
[9], and cDNA- and oligonucleotide-based microarray 
"chip" hybridization [10-12] are possible solutions 
to this bottleneck. It is our belief that the microarray 
approach, which allows the monitoring of expres- 
sion levels of thousands of genes simultaneously, is 
a tool of unprecedented power for use in toxicology 
studies. 



Almost without exception, gene expression is al- 
tered during toxicity, as either a direct or indirect 
result of toxicant exposure. The challenge facing 
toxicologists is to define, under a given set of ex- 
perimental conditions, the characteristic and spe- 
cific pattern of gene expression elicited by a given 
toxicant. Microarray technology offers an ideal plat- 
form for this type of analysis and could be the foun- 
dation for a fundamentally new approach to 
toxicology testing. 

MICROARRAY DEVELOPMENT AND APPLICATIONS 
cDNA Microarrays 

In the past several years, numerous systems were 
developed for the construction of large-scale DNA 
arrays. All of these platforms are based on cDNAs 
or oligonucleotides immobilized to a solid sup- 
port. In the cDNA approach, cDNA (or genomic) 
clones of interest are arrayed in a multi-well for- 
mat and amplified by polymerase chain reaction. 
The products of this amplification, which are usu- 
ally 500- to 2000-bp clones from the 3' regions of 
the genes of interest, are then spotted onto solid 
support by using high-speed robotics. By using 
this method, microarrays of up to 10 000 clones 
can be generated by spotting onto a glass substrate 
[13,14]. Sample detection for microarrays on glass 
involves the use of probes labeled with fluores- 
cent or radioactive nucleotides. 

Fluorescent cDNA probes are generated from con- 
trol and test RNA samples in single-round reverse-tran- 
scription reactions in the presence of fluorescently 
tagged dUTP (e.g., Cy3-dUTP and Cy5-dUTP), which 
produces control and test products labeled with dif- 
ferent fluors. The cDNAs generated from these two 
populations, collectively termed the "probe," are then 
mixed and hybridized to the array under a glass cov- 
erslip [10,11,15]. The fluorescent signal is detected 
by using a custom-designed scanning confocal mi- 
croscope equipped with a motorized stage and lasers 
for fluor excitation [10,11,15]. The data are analyzed 
with custom digital image analysis software that de- 
termines for each DNA feature the ratio of fluor 1 to 
fluor 2, corrected for local background [16,17]. The 
strength of this approach lies in the ability to label 
RNAs from control and treated samples with differ- 
ent fluorescent nucleotides, allowing for the simul- 
taneous hybridization and detection of both 
populations on one microarray. This method elimi- 
nates the need to control for hybridization between 
arrays. The research groups of Drs. Patrick Brown and 
Ron Davis at Stanford University spearheaded the 
effort to develop this approach, which has been suc- 
cessfully applied to studies of Arabidopsis thaliana 
RNA [10], yeast genomic DNA [15], tumorigenic ver- 
sus non-tumorigenic human tumor cell lines [11], 
human T-cells [18], yeast RNA [19], and human in- 
flammatory disease-related genes [20]. The most dra- 
matic result of this effort was the first published 
account of gene expression of an entire genome, that 
of the yeast Saccharomyces cerevisiae [21]. 
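
The per-feature ratio described above (fluor 1 to fluor 2, corrected for local background) can be sketched in a few lines of Python. This is only an illustration and not the image-analysis software cited in [16,17]: the function names, the `floor` parameter used to keep corrected signals positive, and the example intensities are all assumptions made for this sketch.

```python
import math


def background_corrected_ratio(fluor1, bg1, fluor2, bg2, floor=1.0):
    """Ratio of fluor 1 to fluor 2 for one feature, each corrected for local background.

    `floor` (an assumption of this sketch) clamps the corrected signals so that
    features at or below background do not produce zero or negative values.
    """
    signal1 = max(fluor1 - bg1, floor)
    signal2 = max(fluor2 - bg2, floor)
    return signal1 / signal2


def log2_ratio(fluor1, bg1, fluor2, bg2, floor=1.0):
    """Same ratio on a log2 scale, so induction and suppression are symmetric."""
    return math.log2(background_corrected_ratio(fluor1, bg1, fluor2, bg2, floor))


# Hypothetical feature: test sample read in fluor 1, control sample in fluor 2.
ratio = background_corrected_ratio(fluor1=1200, bg1=200, fluor2=600, bg2=100)
print(ratio)
```

Because both samples are read from the same feature, the ratio is independent of how much DNA was deposited in that spot, which is why hybridization does not have to be controlled between arrays.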

In an alternative approach, large numbers of cDNA 
clones can be spotted onto a membrane support, al- 
beit at a lower density [7,22]. This method is useful 
for expression profiling and large-scale screening and 
mapping of genomic or cDNA clones [7,22-24]. In 
expression profiling on filter membranes, two dif- 
ferent membranes are used simultaneously for con- 
trol and test RNA hybridizations, or a single 
membrane is stripped and reprobed. The signal is 
detected by using radioactive nucleotides and visu- 
alized by phosphorimager analysis or autoradiogra- 
phy. Numerous companies now sell such cDNA 
membranes and software to analyze the image data 
[25-27]. 

Oligonucleotide Microarrays 

Oligonucleotide microarrays are constructed either 
by spotting prefabricated oligos on a glass support 
[13] or by the more elegant method of direct in situ 
oligo synthesis on the glass surface by photolithog- 
raphy [28-30]. The strength of this approach lies in 
its ability to discriminate DNA molecules based on 
single base-pair differences. This allows the applica- 
tion of this method to the fields of medical diagnos- 
tics, pharmacogenetics, and sequencing by hybrid- 
ization as well as gene-expression analysis. 

Fabrication of oligonucleotide chips by photoli- 
thography is theoretically simple but technically 
complex [29,30]. The light from a high-intensity 
mercury lamp is directed through a photolitho- 
graphic mask onto the silica surface, resulting in 
deprotection of the terminal nucleotides in the illu- 
minated regions. The entire chip is then reacted with 
the desired free nucleotide, resulting in selected chain 
elongation. This process requires only 4n cycles 
(where n = oligonucleotide length in bases) to syn- 
thesize a vast number of unique oligos, the total num- 
ber of which is limited only by the complexity of the 
photolithographic mask and the chip size [29,31,32]. 
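
The 4n figure follows from giving each base position four masked deprotection and coupling cycles, one per nucleotide. The Python sketch below simulates that schedule under simplifying assumptions (all oligos have the same length, coupling is perfect, and strand orientation is ignored); the function names and the example sequences are hypothetical.

```python
BASES = "ACGT"


def cycles_needed(n):
    """Cycles for oligos of length n: four (one per nucleotide) at each position."""
    return 4 * n


def max_distinct_oligos(n):
    """Upper bound on distinct sequences of length n, before mask and chip-size limits."""
    return 4 ** n


def synthesize(targets):
    """Simulate masked synthesis; return (cycles used, sequences grown per feature)."""
    n = len(targets[0])
    if any(len(t) != n for t in targets):
        raise ValueError("this sketch assumes all oligos have the same length")
    grown = ["" for _ in targets]
    cycles = 0
    for position in range(n):
        for base in BASES:
            cycles += 1
            # The mask illuminates only features whose next base is `base`;
            # the whole chip is then reacted with that nucleotide.
            for i, target in enumerate(targets):
                if target[position] == base:
                    grown[i] += base
    return cycles, grown


cycles, grown = synthesize(["ACGT", "TTTT", "GATC"])
print(cycles, grown)
```

The cycle count depends only on oligo length and not on how many features are on the chip, which is the point of the parallel, mask-based approach.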

Sample preparation involves the generation of 
double-stranded cDNA from cellular poly(A)+ RNA 
followed by antisense RNA synthesis in an in vitro 
transcription reaction with biotinylated or fluor- 
tagged nucleotides. The RNA probe is then frag- 
mented to facilitate hybridization. If the indirect 
visualization method is used, the chips are incubated 
with fluor-linked streptavidin (e.g., phycoerythrin) 
after hybridization [12,33]. The signal is detected with 
a custom confocal scanner [34]. This method has 
been applied successfully to the mapping of genomic 
library clones [35], to de novo sequencing by hybrid- 
ization [28,36], and to evolutionary sequence com- 
parison of the BRCA1 gene [37]. In addition, 
mutations in the cystic fibrosis [38] and BRCA1 [39] 
gene products and polymorphisms in the human im- 
munodeficiency virus-1 clade B protease gene [40] 
have been detected by this method. Oligonucleotide 
chips are also useful for expression monitoring [33] 
as has been demonstrated by the simultaneous evalu- 
ation of gene-expression patterns in nearly all open 
reading frames of the yeast S. cerevisiae [12]. 
More recently, oligonucleotide chips have been used 
to help identify single nucleotide polymorphisms in 
the human [41] and yeast [42] genomes. 

THE USE OF MICROARRAYS IN TOXICOLOGY 

Screening for Mechanism of Action 

The field of toxicology uses numerous in vivo 
model systems, including the rat, mouse, and rab- 
bit, to assess potential toxicity and these bioassays 
are the mainstay of toxicology testing. However, in 
the past several decades, a plethora of in vitro tech- 
niques have been developed to measure toxicity, 
many of which measure toxicant-induced DNA dam- 
age. Examples of these assays include the Ames test, 
the Syrian hamster embryo cell transformation as- 
say, micronucleus assays, measurements of sister 
chromatid exchange and unscheduled DNA synthe- 
sis, and many others. Fundamental to all of these 
methods is the fact that toxicity is often preceded 
by, and results in, alterations in gene expression. In 
many cases, these changes in gene expression are a 
far more sensitive, characteristic, and measurable 
endpoint than the toxicity itself. We therefore pro- 
pose that a method based on measurements of the 
genome-wide gene expression pattern of an organ- 
ism after toxicant exposure is fundamentally infor- 
mative and complements the established methods 
described above. 

We are developing a method by which toxicants 
can be identified and their putative mechanisms of 
action determined by using toxicant-induced gene ex- 
pression profiles. In this method, in one or more de- 
fined model systems, dose and time-course parameters 
are established for a series of toxicants within a given 
prototypic class (e.g., polycyclic aromatic hydrocar- 
bons (PAHs)). Cells are then treated with these agents 
at a fixed toxicity level (as measured by cell survival), 
RNA is harvested, and toxicant-induced gene expres- 
sion changes are assessed by hybridization to a cDNA 
microarray chip (Figure 1). We have developed a cus- 
tom DNA chip, called ToxChip v1.0, specifically for 
this purpose and will discuss it in more detail below. 
The changes in gene expression induced by the test 
agents in the model systems are analyzed, and the 
common set of changes unique to that class of toxi- 
cants, termed a toxicant signature, is determined. 

This signature is derived by ranking across all ex- 
periments the gene-expression data based on rela- 
tive fold induction or suppression of genes in treated 
samples versus untreated controls and selecting the 
most consistently different signals across the sample 
set. A different signature may be established for each 
prototypic toxicant class. Once the signatures are de- 
termined, gene-expression profiles induced by un- 
known agents in these same model systems can then 
be compared with the established signatures. A match 
assigns a putative mechanism of action to the test 
compound. Figure 2 illustrates this signature method 
for different types of oxidant stressors, PAHs, and 
peroxisome proliferators. In this example, the un- 
known compound in question had a gene-expres- 
sion profile similar to that of the oxidant stressors in 
the database. We anticipate that this general method 
will also reveal cross talk between different pathways 
induced by a single agent (e.g., reveal that a com- 
pound has both PAH-like and oxidant-like proper- 
ties). In the future, it may be necessary to distinguish 
very subtle differences between compounds within 
a very large sample set (e.g., thousands of highly simi- 
lar structural isomers in a combinatorial chemistry 
library or peptide library). To generate these highly 
refined signatures, standard statistical clustering tech- 
niques or principal-component analysis can be used. 
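
One way to picture the signature method is the Python sketch below. It is an illustration under stated assumptions, not the authors' procedure: expression changes are represented as log2 ratios of treated versus control samples, "consistently different" is taken to mean that every toxicant in the class changes a gene in the same direction by at least a chosen threshold, and the gene names (borrowed from the group A, B, and C labels of Figure 2), the values, and the threshold are hypothetical.

```python
def derive_signature(class_profiles, min_abs_log2=1.0):
    """Return {gene: +1 or -1} for genes changed consistently across a toxicant class.

    class_profiles: list of dicts (one per toxicant) mapping gene -> log2 ratio.
    Genes are ordered by mean absolute log2 ratio, largest first.
    """
    shared = set(class_profiles[0])
    for profile in class_profiles[1:]:
        shared &= set(profile)
    candidates = []
    for gene in shared:
        values = [profile[gene] for profile in class_profiles]
        if all(v >= min_abs_log2 for v in values):
            direction = 1       # induced by every toxicant in the class
        elif all(v <= -min_abs_log2 for v in values):
            direction = -1      # suppressed by every toxicant in the class
        else:
            continue
        mean_abs = sum(abs(v) for v in values) / len(values)
        candidates.append((mean_abs, gene, direction))
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return {gene: direction for _, gene, direction in candidates}


def match_score(profile, signature, min_abs_log2=1.0):
    """Fraction of signature genes that the unknown agent changes in the same direction."""
    if not signature:
        return 0.0
    hits = sum(
        1 for gene, direction in signature.items()
        if direction * profile.get(gene, 0.0) >= min_abs_log2
    )
    return hits / len(signature)


# Hypothetical log2 ratios for three oxidant stressors.
oxidant_stressors = [
    {"A1": 2.0, "A2": -1.5, "B1": 0.1, "C1": -0.2},
    {"A1": 1.6, "A2": -2.1, "B1": 1.3, "C1": 0.0},
    {"A1": 2.4, "A2": -1.2, "B1": -0.3, "C1": 0.2},
]
signature = derive_signature(oxidant_stressors)
unknown = {"A1": 1.8, "A2": -1.1, "B1": 0.0, "C1": 0.1}
print(signature, match_score(unknown, signature))
```

A high score against one class signature and a low score against the others would suggest a putative mechanism; intermediate scores against several signatures are the kind of cross talk mentioned above.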

For the studies outlined in Figure 2, we developed 
the custom cDNA microarray chip ToxChip v1.0. 
Figure 1. Simplified overview of the method for sample 
preparation and hybridization to cDNA microarrays. For illus- 
trative purposes, samples derived from cell culture are depicted, 
although other sample types are amenable to this analysis. 

Figure 2. Schematic representation of the method for iden- 
tification of a toxicant's mechanism of action. In this method, 
gene-expression data derived from exposure of model sys- 
tems to known toxicants are analyzed, and a set of changes 
characteristic to that type of toxicant (termed the toxicant 
signature) is identified. As depicted, oxidant stressors produce 
consistent changes in group A genes (indicated by red and 
green circles), but not group B or C genes (indicated by gray 
circles). The set of gene-expression changes elicited by the 
suspected toxicant is then compared with these characteristic 
patterns, and a putative mechanism of action is assigned to 
the unknown agent. 



The 2090 human genes that comprise this subarray 
were selected for their well-documented involve- 
ment in basic cellular processes as well as their re- 
sponses to different types of toxic insult. Included 
on this list are DNA replication and repair genes, 
apoptosis genes, and genes responsive to PAHs and 
dioxin-like compounds, peroxisome proliferators, 
estrogenic compounds, and oxidant stress. Some of 
the other categories of genes include transcription 
factors, oncogenes, tumor suppressor genes, cyclins, 
kinases, phosphatases, cell adhesion and motility 
genes, and homeobox genes. Also included in this 
group are 84 housekeeping genes, whose hybridiza- 
tion intensity is averaged and used for signal nor- 
malization of the other genes on the chip. To date, 
very few toxicants have been shown to have appre- 
ciable effects on the expression of these housekeep- 
ing genes. However, this housekeeping list will be 
revised if new data warrant the addition or deletion 
of a particular gene. Table 1 contains a general de- 
scription of some of the different classes of genes 
that comprise ToxChip v1.0. 
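
The housekeeping-based normalization mentioned above can be sketched as follows. The text states only that housekeeping intensities are averaged and used to normalize the other genes; dividing each intensity by that average is an assumption of this sketch, as are the function name and the example values.

```python
def normalize_to_housekeeping(intensities, housekeeping_genes):
    """Divide every gene's intensity by the mean intensity of the housekeeping genes.

    intensities: dict mapping gene -> hybridization intensity on one chip.
    housekeeping_genes: iterable of gene names treated as invariant controls.
    """
    reference = [intensities[g] for g in housekeeping_genes if g in intensities]
    if not reference:
        raise ValueError("no housekeeping genes found on this chip")
    mean_reference = sum(reference) / len(reference)
    if mean_reference <= 0:
        raise ValueError("housekeeping signal must be positive")
    return {gene: value / mean_reference for gene, value in intensities.items()}


# Hypothetical chip with two housekeeping genes and one gene of interest.
chip = {"hk1": 100.0, "hk2": 300.0, "geneX": 400.0}
print(normalize_to_housekeeping(chip, ["hk1", "hk2"]))
```

Normalized values from different chips can then be compared directly, provided the treatment does not itself alter the housekeeping genes, which is why the list is open to revision.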

When a toxicant signature is determined, the 
genes within this signature are flagged within the 
database. When uncharacterized toxicants are then 
screened, the data can be quickly reformatted so that 
blocks of genes representing the different signatures 
are displayed [11]. This facilitates rapid, visual in- 
terpretation of data. We are also developing Tox- 
Chip v2.0 and chips for other model systems, 
including rat, mouse, Xenopus, and yeast, for use in 
toxicology studies. 

Animal Models in Toxicology Testing 

The toxicology community relies heavily on the 
use of animals as model systems for toxicology test- 
ing. Unfortunately, these assays are inherently ex- 
pensive, require large numbers of animals and take a 
long time to complete and analyze. Therefore, the 
National Institute of Environmental Health Sciences 
(NIEHS), the National Toxicology Program, and the 
toxicology community at large are committed to re- 
ducing the number of animals used, by developing 
more efficient and alternative testing methodologies. 
Although substantial progress has been made in the 
development of alternative methods, bioassays are 
still used for testing endpoints such as neurotoxic- 
ity, immunotoxicity, reproductive and developmen- 
tal toxicology, and genetic toxicology. The rodent 
cancer bioassay is a particularly expensive and time- 
consuming assay, as it requires almost 4 yr, 1200 
animals, and millions of dollars to execute and ana- 
lyze [43]. In vitro experiments of the type outlined 
in Figure 2 might provide evidence that an unknown 
agent is (or is not) responsible for eliciting a given 
biological response. This information would help to 
select a bioassay more specifically suited to the agent 
in question or perhaps suggest that a bioassay is not 
necessary, which would dramatically reduce cost, 
animal use, and time. 

Table 1. ToxChip v1.0: A Human cDNA Microarray 
Chip Designed to Detect Responses to Toxic Insult* 

Gene category                             No. of genes on chip 

Apoptosis                                  72 
DNA replication and repair                 99 
Oxidative stress/redox homeostasis         90 
Peroxisome proliferator responsive         22 
Dioxin/PAH responsive                      12 
Estrogen responsive                        63 
Housekeeping                               84 
Oncogenes and tumor suppressor genes       76 
Cell-cycle control                         51 
Transcription factors                     131 
Kinases                                   276 
Phosphatases                               88 
Heat-shock proteins                        23 
Receptors                                 349 
Cytochrome P450s                           30 

*This list is intended as a general guide. The gene categories are not 
unique, and some genes are listed in multiple categories. 

The addition of microarray techniques to stan- 
dard bioassays may dramatically enhance the sen- 
sitivity and interpretability of the bioassay and 
possibly reduce its cost. Gene-expression signatures 
could be determined for various types of tissue-spe- 
cific toxicants, and new compounds could be 
screened for these characteristic signatures, provid- 
ing a rapid and sensitive in vivo test. Also, because 
gene expression is often exquisitely sensitive to low 
doses of a toxicant, the combination of gene-expres- 
sion screening and the bioassay might allow the use 
of lower toxicant doses, which are more relevant to 
human exposure levels, and the use of fewer ani- 
mals. In addition, gene-expression changes are nor- 
mally measured in hours or days, not in the months 
to years required for tumor development. Further- 
more, microarrays might be particularly useful for 
investigating the relationship between acute and 
chronic toxicity and identifying secondary effects 
of a given toxicant by studying the relationship 
between the duration of exposure to a toxicant and 
the gene-expression profile produced. Thus, a bio- 
assay that incorporates gene-expression signatures 
with traditional endpoints might be substantially 
shorter, use more realistic dose regimens, and cost 
substantially less than the current assays do. 

These considerations are also relevant for branches 
of toxicology not related to human health and not 
using rodents as model systems, such as aquatic toxi- 
cology and plant pathology. Bioassays based on the 
fathead minnow, Daphnia, and Arabidopsis could 
also be improved by the addition of microarray analy- 
sis. The combination of microarrays with traditional 
bioassays might also be useful for investigating some 
of the more intractable problems in toxicology re- 
search, such as the effects of complex mixtures and 
the difficulties in cross-species extrapolation. 

Exposure Assessment, Environmental Monitoring, 
and Drug Safety 

The currently used methods for assessment of ex- 
posure to chemical toxicants are based on measure- 
ment of tissue toxin levels or on surrogate markers 
of toxicity, termed biomarkers (e.g., peripheral blood 
levels of hepatic enzymes or DNA adducts). Because 
gene expression is a sensitive endpoint, gene expres- 
sion as measured with microarray technology may 
be useful as a new biomarker to more precisely iden- 
tify hazards and to assess exposure. Similarly, 
microarrays could be used in an environmental- 
monitoring capacity to measure the effect of poten- 
tial contaminants on the gene-expression profiles 
of resident organisms. In an analogous fashion, 
microarrays could be used to measure gene-expres- 
sion endpoints in subjects in clinical trials. The com- 
bination of these gene-expression data and more 
established toxic endpoints in these trials could be 
used to define highly precise surrogates of safety. 

Gene-expression profiles in samples from exposed 
individuals could be compared to the profiles of the 
same individuals before exposure. From this infor- 
mation, the nature of the toxic exposure can be de- 
termined or a relative clinical safety factor estimated. 
In the future it may also be possible to estimate not 
only the nature but the dose of the toxicant for a 
given exposure, based on relative gene-expression 
levels. This general approach may be particularly 
appropriate for occupational-health applications, in 
which unexposed and exposed samples from the 
same individuals may be obtainable. For example, 
a pilot study of gene expression in peripheral-blood 
lymphocytes of Polish coke-oven workers exposed 
to PAHs (and many other compounds) is under con- 
sideration at the NIEHS. An important consideration 
for these types of studies is that gene expression can 
be affected by numerous factors, including diet, 
health, and personal habits. To reduce the effects 
of these confounding factors, it may be necessary 
to compare pools of control samples with pools of 
treated samples. In the future it may be possible to 
compare exposed sample sets to a national database 
of human-expression data, thus eliminating the 
need to provide an unexposed sample from the same 
individual. Efforts to develop such a national gene- 
expression database are currently under way [44,45]. 
However, this national database approach will re- 
quire a better understanding of genome-wide gene 
expression across the highly diverse human popu- 
lation and of the effects of environmental factors 
on this expression. 
Alleles, Oligo Arrays, and Toxicogenetics 

Gene sequences vary between individuals, and 
this variability can be a causative factor in human 
diseases of environmental origin [46,47]. A new area 
of toxicology, termed toxicogenetics, was recently 
developed to study the relationship between genetic 
variability and toxicant susceptibility. This field is 
not the subject of this discussion, but it is worth- 
while to note that the ability of oligonucleotide ar- 
rays to discriminate DNA molecules based on single 
base-pair differences makes these arrays uniquely 
useful for this type of analysis. Recent reports dem- 
onstrated the feasibility of this approach [41,42]. 
The NIEHS has initiated the Environmental Genome 
Project to identify common sequence polymor- 
phisms in 200 genes thought to be involved in en- 
vironmental diseases [48]. In a pilot study on the 
feasibility of this application to the Environmental 
Genome Project, oligonucleotide arrays will be used 
to resequence 20 candidate genes. This toxicogenetic 
approach promises to dramatically improve our un- 
derstanding of interindividual variability in disease 
susceptibility. 

FUTURE PRIORITIES 

There are many issues that must be addressed be- 
fore the full potential of microarrays in toxicology 
research can be realized. Among these are model sys- 
tem selection, dose selection, and the temporal na- 
ture of gene expression. In other words, in which 
species, at what dose, and at what time do we look 
for toxicant-induced gene expression? If human 
samples are analyzed, how variable is global gene 
expression between individuals, before and after toxi- 
cant exposure? What are the effects of age, diet, and 
other factors on this expression? Experience, in the 
form of large data sets of toxicant exposures, will 
answer these questions. 

One of the most pressing issues for array scientists 
is the construction of a national public database 
(linked to the existing public databases) to serve as a 
repository for gene-expression data. This relational 
database must be made available for public use, and 
researchers must be encouraged to submit their ex- 
pression data so that others may view and query the 
information. Researchers at the National Institutes 
of Health have made laudable progress in develop- 
ing the first generation of such a database [44,45]. In 
addition, improved statistical methods for gene clus- 
tering and pattern recognition are needed to ana- 
lyze the data in such a public database. 

The proliferation of different platforms and meth- 
ods for microarray hybridizations will improve 
sample handling and data collection and analysis and 
reduce costs. However, the variety of microarray 
methods available will create problems of data com- 
patibility between platforms. In addition, the near- 
infinite variety of experimental conditions under 
which data will be collected by different laborato- 
ries will make large-scale data analysis extremely dif- 
ficult. To help circumvent these future problems, a 
set of standards to be included on all platforms 
should be established. These standards would facili- 
tate data entry into the national database and serve 
as reference points for cross-platform and inter-labo- 
ratory data analysis. 

Many issues remain to be resolved, but it is clear 
that new molecular techniques such as microarray 
hybridization will have a dramatic impact on toxicol- 
ogy research. In the future, the information gathered 
from microarray-based hybridization experiments will 
form the basis for an improved method to assess the 
impact of chemicals on human and environmental 
health. 
Abstract 

Recent progress in genomics and proteomics technologies has created a unique opportunity to significantly impact 
the pharmaceutical drug development processes. The perception that cells and whole organisms express specific 
inducible responses to stimuli such as drug treatment implies that unique expression patterns, molecular fingerprints, 
indicative of a drug's efficacy and potential toxicity are accessible. The integration into state-of-the-art toxicology of 
assays allowing one to profile treatment-related changes in gene expression patterns promises new insights into 
mechanisms of drug action and toxicity. The benefits will be improved lead selection, and optimized monitoring of 
drug efficacy and safety in pre-clinical and clinical studies based on biologically relevant tissue and surrogate markers. 
© 2000 Elsevier Science Ireland Ltd. All rights reserved. 
1. Introduction 

The majority of drugs act by binding to protein 
targets, most to known proteins representing en- 
zymes, receptors and channels, resulting in effects 
such as enzyme inhibition and impairment of 
signal transduction. The treatment-induced per- 
turbations provoke feedback reactions aiming to 
compensate for the stimulus, which almost always 
are associated with signals to the nucleus, result- 
ing in altered gene expression. Such gene expres- 
sion regulations account for both the 
pharmacological action and the toxicity of a drug 
and can be visualized by either global mRNA or 
global protein expression profiling. Hence, for 
each individual drug, a characteristic gene regula- 
tion pattern, its molecular fingerprint, exists 
which bears valuable information on its mode of 
action and its mechanism of toxicity. 

Gene expression is a multistep process that 
results in an active protein (Fig. 1). There exist 
numerous regulation systems that exert control at 
and after the transcription and the translation 
step. Genomics, by definition, encompasses the 
quantitative analysis of transcripts at the mRNA 
level, while the aim of proteomics is to quantify 
gene expression further down-stream, creating a 
snapshot of gene regulation closer to ultimate cell 
function control. 
2. Global mRNA profiling 

Expression data at the mRNA level can be 
produced using a set of different technologies 
such as DNA microarrays, reverse transcript 
imaging, amplified fragment length polymorphism 
(AFLP), serial analysis of gene expression 
(SAGE) and others. Currently, DNA microarrays 
are very popular and show great promise. 
On a typical array, each gene of interest is repre- 
sented either by a long DNA fragment (200-2400 
bp) typically generated by polymerase chain reac- 
tion (PCR) and spotted on a suitable substrate 
using robotics (Schena et al., 1995; Shalon et al., 
1996) or by several short oligonucleotides (20-30 
bp) synthesized directly onto a solid support using 
photolabile nucleotide chemistry (Fodor et al., 
1991; Chee et al., 1996). From control and treated 
tissues, total RNA or mRNA is isolated and 
reverse transcribed in the presence of radioactive 
or fluorescent labeled nucleotides, and the labeled 
probes are then hybridized to the arrays. The 
intensity of the array signal is measured for each 
gene transcript by either autoradiography or laser 
scanning confocal microscopy. The ratio between 
the signals of control and treated samples reflects 
the relative drug-induced change in transcript 
abundance. 



3. Global protein profiling 

Global quantitative expression analysis at the 
protein level is currently restricted to the use of 
two-dimensional gel electrophoresis. This tech- 
nique combines separation of tissue proteins by 
isoelectric focusing in the first dimension and by 
sodium dodecyl sulfate slab gel electrophoresis- 
based molecular weight separation on the second, 
orthogonal dimension (Anderson et al., 1991). 
The product is a rectangular pattern of protein 
spots that are typically revealed by Coomassie 
Blue, silver or fluorescent staining (Fig. 2). 
Protein spots are identified by mass spectrometry 
following generation of peptide mass fingerprints 
(Mann et al., 1993) and sequence tags (Wilkins et 
al., 1996). Similar to the mRNA approach, the 
optical densities of corresponding spots from 
control and treated samples are compared to 
search for treatment-related changes. 



4. Expression data analysis 

Bioinformatics forms a key element required to 
organize, analyze and store expression data from 
either source, the mRNA or the protein level. The 
overall objective, once a mass of high-quality 
Fig. 1. Production of an active protein is a multistep process in which numerous regulation systems exert control at various stages 
of expression. Molecular fingerprints of drugs can be visualized through expression profiling at the mRNA level (genomics) using 
a variety of technologies and at the protein level (proteomics) using two-dimensional gel electrophoresis. 
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Fig. 2. Computerized representation of a Coomassie Blue stained two-dimensional gel electrophoresis pattern of Fischer F344 rat 
liver homogenate. 



quantitative expression data has been collected, is 
to visualize complex patterns of gene expression 
changes, to detect pathways and sets of genes 
tightly correlated with treatment efficacy and toxi- 
city, and to compare the effects of different sets of 
treatment (Anderson et al., 1996). As the drug 
effect database is growing, one may detect similar- 
ities and differences between the molecular finger- 
prints produced by various drugs, information 
that may be crucial to make a decision whether to 
refocus or extend the therapeutic spectrum of a 
drug candidate. 



5. Comparison of global mRNA and protein 
expression profiling 

There are several synergies and overlaps of data 
obtained by mRNA and protein expression analy- 
sis. Low abundant transcripts may not be easily 
quantified at the protein level using standard two- 
dimensional gel electrophoresis analysis and their 
detection may require prefractionation of sam- 
ples. The expression of such genes may be prefer- 
ably quantified at the mRNA level using 
techniques allowing PCR-mediated target amplifi- 
cation. Tissue biopsy samples typically yield good 
quality of both mRNA and proteins; however, the 
quality of mRNA isolated from body fluids is 
often poor due to the faster degradation of 
mRNA when compared with proteins. RNA sam- 
ples from body fluids such as serum or urine are 
often not very 'meaningful', and secreted proteins 
are likely more reliable surrogate markers for 
treatment efficacy and safety. Detection of post- 
translational modifications, events often related to 
function or nonfunction of a protein, is restricted 
to protein expression analysis and rarely can be 
predicted by mRNA profiling. Information on 
subcellular localization and translocation of 
proteins has to be acquired at the level of the 
protein in combination with sample prefractiona- 
tion procedures. The growing evidence of a poor 
correlation between mRNA and protein abun- 
dance (Anderson and Seilhamer, 1997) further 
suggests that the two approaches, mRNA and 
protein profiling, are complementary and should 
be applied in parallel. 



6. Expression profiling and drug development 

Understanding the mechanisms of action and 
toxicity, and being able to monitor treatment 
efficacy and safety during trials is crucial for the 
successful development of a drug. Mechanistic 
insights are essential for the interpretation of drug 
effects and enhance the chances of recognizing 
potential species specificities contributing to an 
improved risk profile in humans (Richardson et 
al., 1993; Steiner et al., 1996b; Aicher et al., 1998). 
The value of expression profiling further increases 
when links between treatment-induced expression 
profiles and specific pharmacological and toxic 
endpoints are established (Anderson et al., 1991, 
1995, 1996; Steiner et al., 1996a). Changes in gene 
expression are known to precede the manifesta- 
tion of morphological alterations, giving expres- 
sion profiling a great potential for early 
compound screening, enabling one to select drug 
candidates with wide therapeutic windows 
reflected by molecular fingerprints indicative of 
high pharmacological potency and low toxicity 
(Arce et al., 1998). In later phases of drug devel- 
opment, surrogate markers of treatment efficacy 
and toxicity can be applied to optimize the moni- 
toring of pre-clinical and clinical studies (Doherty 
et al., 1998). 



7. Perspectives 

The basic methodology of safety evaluation has 
changed little during the past decades. Toxicity in 
laboratory animals has been evaluated primarily 
by using hematological, clinical chemistry and 
histological parameters as indicators of organ 
damage. The rapid progress in genomics and pro- 
teomics technologies creates a unique opportunity 
to dramatically improve the predictive power of 
safety assessment and to accelerate the drug devel- 
opment process. Application of gene and protein 
expression profiling promises to improve lead se- 
lection, resulting in the development of drug can- 
didates with higher efficacy and lower toxicity. 
The identification of biologically relevant surro- 
gate markers correlated with treatment efficacy 
and safety bears a great potential to optimize the 
monitoring of pre-clinical and clinical trials. 

Application of DNA Arrays to Toxicology 

DNA array technology makes it possible to rapidly genotype individuals or quantify the expression 
of thousands of genes on a single filter or glass slide, and holds enormous potential in toxicologic 
applications. This potential led to a U.S. Environmental Protection Agency-sponsored workshop 
titled "Application of Microarrays to Toxicology" on 7-8 January 1999 in Research Triangle Park, 
North Carolina. In addition to providing state-of-the-art information on the application of DNA or 
gene microarrays, the workshop catalyzed the formation of several collaborations, committees, and 
user groups throughout the Research Triangle Park area and beyond. Potential applications of 
microarrays to toxicologic research and risk assessment include genome-wide expression analyses to 
identify gene-expression networks and toxicant-specific signatures that can be used to define mode 
of action, for exposure assessment, and for environmental monitoring. Arrays may also prove useful 
for monitoring genetic variability and its relationship to toxicant susceptibility in human popula- 
tions. Key words: DNA arrays, gene arrays, microarrays, toxicology. Environ Health Perspect 
107:681-685 (1999). [Online 6 July 1999] 
http://ehpnet1.niehs.nih.gov/docs/1999/107p681-685rockett/ab 



Decoding the genetic blueprint is a dream that 
offers manifold returns in terms of understand- 
ing how organisms develop and function in an 
often hostile environment. With the rapid 
advances in molecular biology over the last 30 
years, the dream has come a step closer to reali- 
ty. Molecular biologists now have the ability to 
elucidate the composition of any genome. 
Indeed, almost 20 genomes have already been 
sequenced and more than 60 are currently 
under way. Foremost among these is the 
Human Genome Mapping Project. However, 
the genomes of a number of commonly used 
laboratory species are also under intensive 
investigation, including yeast, Arabidopsis, 
maize, rice, zebra fish, mouse, rat, and dog. It 
is widely expected that the completion of such 
programs will facilitate the development of 
many powerful new techniques and approach- 
es to diagnosing and treating genetically and 
environmentally induced diseases which afflict 
mankind. However, the vast amount of data 
being generated by genome mapping will 
require new high-throughput technologies to 
investigate the function of the millions of new 
genes that are being reported. Among the most 
widely heralded of the new functional 
genomics technologies are DNA arrays, which 
represent perhaps the most anticipated new 
molecular biology technique since polymerase 
chain reaction (PCR). 

Arrays enable the study of literally thou- 
sands of genes in a single experiment. The 
potential importance of arrays is enormous and 
has been highlighted by the recent publication 
of an entire Nature Genetics supplement dedi- 
cated to the technology (1). Despite this huge 
surge of interest, DNA arrays are still little used 
and largely unproven, as demonstrated by the 
high ratio of review and press articles to actual 
data papers. Even so, the potential they offer 
has driven venture capitalists into a frenzy of 
investment and many new companies are 
springing up to claim a share of this rapidly 
developing market. 

The U.S. Environmental Protection 
Agency (EPA) is interested in applying DNA 
array technology to ongoing toxicologic stud- 
ies. To learn more about the current state of 
the technology, the Reproductive Toxicology 
Division (RTD) of the National Health and 
Environmental Effects Research Laboratory 
(NHEERL; Research Triangle Park, NC) 
hosted a workshop on "Application of 
Microarrays to Toxicology" on 7-8 January 
1999 in Research Triangle Park, North 
Carolina. The workshop was organized by 
David Dix, Robert Kavlock, and John Rockett 
of the RTD/NHEERL. Twenty-two intra- 
mural and extramural scientists from govern- 
ment, academia, and industry shared informa- 
tion, data, and opinions on the current and 
future applications for this exciting new tech- 
nology. The workshop had more than 150 
attendees, including researchers, students, and 
administrators from the EPA, the National 
Institute of Environmental Health Sciences 
(NIEHS), and a number of other establish- 
ments from Research Triangle Park and 
beyond. Presentations ranged from the tech- 
nology behind array production through the 
sharing of actual experimental data and projec- 
tions on the future importance and applica- 
tions of arrays. The information contained in 
the workshop presentations should provide aid 
and insight into arrays in general and their 
application to toxicology in particular. 

Array Elements 

In the context of molecular biology, the word 
"array" is normally used to refer to a series of 
DNA or protein elements firmly attached in 
a regular pattern to some kind of supportive 
medium. DNA array is often used inter- 
changeably with gene array or microarray. 
Although not formally defined, microarray is 
generally used to describe the higher density 
arrays typically printed on glass chips. The 
DNA elements that make up DNA arrays 
can be oligonucleotides, partial gene 
sequences, or full-length cDNAs. Companies 
offering pre-made arrays that contain less 
than full-length clones normally use regions 
of the genes which are specific to that gene to 
prevent false positives arising through cross- 
hybridization. Sequence verification of 
cDNA clone identity is necessary because of 
errors in identifying specific clones from 
cDNA libraries and databases. Premade 
DNA arrays printed on membranes are cur- 
rently or imminently available for human, 
mouse, and rat. In most cases they contain 
DNA sequences representing several thou- 
sand different sequence clusters or genes as 
delineated through the National Center for 
Biotechnology Information UniGene Project 
(2). Many of these different UniGene clusters 
(putative genes) are represented only by 
expressed sequence tags (ESTs). 

Array Printing 

Arrays are typically printed on one of two 
types of support matrix. Nylon membranes 
are used by most off-the-shelf array providers 
such as Clontech Laboratories, Inc. 
(Palo Alto, CA), Genome Systems, Inc. (St. 
Louis, MO), and Research Genetics, Inc. 
(Huntsville, AL). Microarrays such as those 
produced by Affymetrix, Inc. (Santa Clara, 
CA), Incyte Pharmaceuticals, Inc. (Palo Alto, 
CA), and many do-it-yourself (DIY) arraying 
groups use glass wafers or slides. Although 
standard microscope slides may be used, they 
must be preprepared to facilitate sticking 
of the DNA to the glass. Several different 
coatings have been successfully used, includ- 
ing silane and lysine. The coating of slides 
can easily be carried out in the laboratory, 
but many prefer the convenience of precoated 
slides available from suppliers. 

Once the support matrix has been pre- 
pared, the DNA elements can be applied by 
several methods. Affymetrix, Inc., has devel- 
oped a unique photolithographic technology 
for attaching oligonucleotides to glass wafers. 
More commonly, DNA is applied by either 
noncontact or contact printing. Noncontact 
printers can use thermal, solenoid, or piezoelec- 
tric technology to spray aliquots of solution 
onto the support matrix and may be used to 
produce slide or membrane-based arrays. 
Cartesian Technologies, Inc. (Irvine, CA) has 
developed nQUAD technology for use in its 
PixSys printers. The system couples a syringe 
pump with a microsolenoid valve, a combi- 
nation that provides rapid quantitative dispens- 
ing of nanoliter volumes (down to 4.2 nL) over 
a variable volume range. A different approach 
to noncontact printing uses a solid pin and ring 
combination (Genetic MicroSystems, Inc., 
Woburn, MA). This system (Figure 1) allows a 
broader range of sample, including cell suspen- 
sions and particulates, because the printing 
head cannot be blocked up in the same way as 
a spray nozzle. Fluid transfer is controlled in 
this system primarily by the pin dimensions 
and the force of deposition, although the 
nature of the support matrix and the sample 
will also affect transfer to some degree. 

In contact printing, the pin head is dipped 
in the sample and then touched to the support 
matrix to deposit a small aliquot. Split pins 
were one of the first contact-printing devices 
to be reported and are the suggested format 
for DIY arrayers, as described by Brown (3). 
Split pins are small metal pins with a precise 
groove cut vertically in the middle of the pin 
tip. In this system, 1-48 split pins are posi- 
tioned in the pin head. The split pins work by 
simple capillary action, not unlike a fountain 
pen: when the pin heads are dipped in the 
sample, liquid is drawn into the pin groove. A 
small (fixed) volume is then deposited each 
time the split pins are gently touched to 
the support matrix. Sample (100-500 pL 
depending on a variety of parameters) can be 
deposited on multiple slides before refilling is 
required, and array densities of > 2,500 
spots/cm² may be produced. The deposit vol- 
ume depends on the split size, sample fluidi- 
ty, and the speed of printing. Split pins are 
relatively simple to produce and can be made 
in-house if a suitable machine shop is avail- 
able. Alternatively, they can be obtained 
directly from companies such as TeleChem 
International, Inc. (Sunnyvale, CA). 

Irrespective of their source, printers 
should be run through a preprint sequence 
prior to producing the actual experimental 
arrays; the first 100 or so spots of a new run 
tend to be somewhat variable. Factors affect- 
ing spot reproducibility include slide treat- 
ment homogeneity, sample differences, and 
instrument errors. Other factors that come 
into play include clean ejection of the drop 
and clogging (nQUAD printing) and 
mechanical variations and long-term alter- 
ation in print-head surface of solid and split 
pins. However, with careful preparation it is 
possible to get a coefficient of variance for 
spot reproducibility below 10%. 
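
As a rough illustration of how spot reproducibility could be checked against the 10% figure, the sketch below computes the coefficient of variation (standard deviation divided by mean, as a percentage) for replicate spots of one sample. The function name, the use of the sample standard deviation, and the intensity values are assumptions for the example, not part of any printer vendor's software.

```python
import statistics


def coefficient_of_variation(intensities):
    """Return the coefficient of variation of replicate spot signals, in percent."""
    mean = statistics.mean(intensities)
    if mean == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    return 100.0 * statistics.stdev(intensities) / mean


# Hypothetical signals from five replicate spots of the same sample.
replicate_spots = [980.0, 1010.0, 1005.0, 995.0, 1010.0]

cv = coefficient_of_variation(replicate_spots)
passes = cv < 10.0  # threshold taken from the 10% figure in the text
```

In practice one would probably exclude the first 100 or so spots of a run before computing this, since the text notes they tend to be variable.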

One potential printing problem is sample 
carryover. Repeated washing, blotting, and 
drying (vacuum) of print pins between samples 
is normally effective at reducing sample carry- 
over to negligible amounts. Printing should 
also be carried out in a controlled environ- 
ment. Humidified chambers are available in 
which to place printers. These help prevent 
dust contamination and produce a uniform 
drying rate, which is important in determining 
spot size, quality, and reproducibility. 

In summary, although several printing 
technologies are available, none are par- 
ticularly outstanding and the bottom line 
is that they are still in a relatively early stage 
of evolution. 

Array Hybridization 

The hybridization protocol is, practically 
speaking, relatively straightforward and those 
with previous experience in blotting should 
have little difficulty. Array hybridizations 
are, in essence, reverse Southern/Northern 
blots: instead of applying a labeled probe to 
the target population of DNA/RNA, the 
labeled population is applied to the probe(s). 
With membrane-based arrays, the control and 
treated mRNA populations are normally con- 
verted to cDNA and labeled with isotope (e.g., 
33P) in the process. These labeled populations 
are then hybridized independently to parallel 
or serial arrays and the hybridization signal is 
detected with a phosphorimager. A less com- 
monly used alternative to radioactive probes is 
enzymatic detection. The probe may be 
biotinylated, haptenylated, or have alkaline 
phosphatase/horseradish peroxidase attached. 
Hybridization is detected by enzymatic reac- 
tion yielding a color reaction (4). Differences 
in hybridization signals can be detected by eye 
or, more accurately, with the help of digital 
imaging and commercially available software. 
The labeling of the test populations for slide- 
based microarrays uses a slightly different 
approach. The probe typically consists of two 
samples of polyA+ RNA (usually from a treated 
and a control population) that are converted to 
cDNA; in the process each is labeled with a 
different fluor. The independently labeled 
probes are then mixed together and hybridized 
to a single microarray slide and the resulting 
combined fluorescent signal is scanned. After 
normalization, it is possible to determine the 
ratio of fluorescent signals from a single 
hybridization of a slide-based microarray. 

Figure 1. Genetic MicroSystems (Woburn, MA) pin 
ring system for printing arrays. The pin ring com- 
bination consists of a circular open ring oriented 
parallel to the sample solution, with a vertical pin 
centered over the ring. When the ring is dipped 
into a solution and lifted, it withdraws an aliquot 
of sample held by surface tension. To spot the 
sample, the pin is driven down through the ring 
and a portion of the solution is transferred to the 
bottom of the pin. The pin continues to move 
downward until the pendant drop of solution 
makes contact with the underlying surface. The 
pin is then lifted, and gravity and surface tension 
cause deposition of the spot onto the array. 
Figure from Flowers et al. (14), with permission 
from Genetic MicroSystems. 
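
The text does not say how the two-color signals are normalized. One simple possibility, shown here purely as an assumed sketch, is to center the log2 ratios on their median, which presumes that most arrayed genes do not change between control and treated samples. The function name and the intensity values are hypothetical.

```python
import math
import statistics


def normalized_log_ratios(control_signal, treated_signal):
    """Median-center the per-spot log2(treated / control) ratios.

    Assumption: the median spot is unchanged by treatment, so any
    offset of the median from zero reflects labeling or scanning bias.
    """
    raw = [math.log2(t / c) for c, t in zip(control_signal, treated_signal)]
    center = statistics.median(raw)
    return [value - center for value in raw]


# Hypothetical intensities for five spots; the treated channel is
# uniformly twice as bright, with one induced and one repressed gene.
control_channel = [100.0, 200.0, 400.0, 800.0, 1600.0]
treated_channel = [200.0, 400.0, 3200.0, 1600.0, 800.0]

ratios = normalized_log_ratios(control_channel, treated_channel)
```

After centering, the three unchanged spots sit at 0 while the induced and repressed spots stand out at +2 and -2 on the log2 scale.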

cDNA derived from control and treated 
populations of RNA is most commonly 
hybridized to arrays, although subtractive 
hybridization or differential display reactions 
may also be used. Fluorophore- or radiola- 
beled nucleotides are directly incorporated 
into the cDNA in the process of converting 
RNA to cDNA. Alternatively, 5' end-labeled 
primers may be used for cDNA synthesis. 
These are labeled with a fluorophore for 
direct visualization of the hybridized array. 
Alternatively, biotin or a hapten may be 
attached to the primer, in which case fluor- 
labeled streptavidin or antibody must be 
applied before a signal can be generated. The 
most commonly used fluorophores at present 
are cyanine (Cy)3 and Cy5 (Amersham 
Pharmacia Biotech AB, Uppsala, Sweden). 
However, the relative expense of these fluo- 
rescent conjugates has driven a search for 
cheaper alternatives. Fluorescein, rhodamine, 
and Texas red have all been used, and 
companies such as Molecular Probes, Inc. 
(Eugene, OR) are developing a series of 
labeled nucleotides with a wide range of exci- 
tation and emission spectra which may prove 
to function as well as the Cy dyes. 

Analysis of DNA Microarrays 

Table 1. Advantages and disadvantages of different microarray scanning systems 



Membrane-based arrays are normally analyzed 
on film or with a phosphorimager, whereas 
chip-based arrays require more specialized scan- 
ning devices. These can be divided into three 
main groups: the charge-coupled device camera 
systems, the nonconfocal laser scanners, and the 
confocal laser scanners. The advantages and dis- 
advantages of each system are listed in Table 1. 

Because a typical spot on a microarray can 
contain > 10^8 molecules, it is clear that a large 
variation in signal strength may occur. 
Current scanners cannot work across this 
many orders of magnitude (4 or 5 is more typ- 
ical). However, the scanning parameters can 
normally be adjusted to collect more or less 
signal, such that two or three scans of the same 
array should permit the detection of rare and 
abundant genes. 
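
One way the multiple scans mentioned above might be combined is sketched below. This is an assumed approach, not a procedure given in the text: spots that saturate in a high-sensitivity scan are replaced by their low-sensitivity value, rescaled by the median ratio observed on spots that are unsaturated in both scans. The saturation limit of 65535 assumes a 16-bit detector; all names and values are hypothetical.

```python
import statistics


def merge_scans(high_gain, low_gain, saturation=65535.0):
    """Combine two scans of the same array into one extended-range signal list."""
    # Estimate the gain ratio from spots that are reliable in both scans.
    usable = [(h, l) for h, l in zip(high_gain, low_gain)
              if h < saturation and l > 0]
    if not usable:
        raise ValueError("no unsaturated spots available to estimate the scale")
    scale = statistics.median(h / l for h, l in usable)
    # Keep the high-gain value unless it is saturated.
    return [h if h < saturation else l * scale
            for h, l in zip(high_gain, low_gain)]


# Hypothetical scans of four spots; the last two saturate at high gain.
scan_high = [500.0, 4000.0, 65535.0, 65535.0]
scan_low = [50.0, 400.0, 9000.0, 30000.0]

merged = merge_scans(scan_high, scan_low)
```

The high-gain scan preserves rare transcripts that would be lost in noise at low gain, while the rescaled low-gain values recover abundant transcripts that would otherwise be clipped.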

When a microarray is scanned, the fluores- 
cent images are captured by software normally 
included with the scanner. Several commercial 
suppliers provide additional software for quan- 
tifying array images, but the software tools are 
constantly evolving to meet the developing 
needs of researchers, and it is prudent to 
define one's own needs and clarify the exact 
capabilities of the software before its purchase. 
Issues that should be considered include the 
following: 

• Can the software locate offset spots? 

• Can it quantitate across irregular hybridiza- 
tion signals? 

• Can the arrayed genes be programmed in for 
easy identification and location? 

• Can the software connect via the Internet to 
databases containing further information on 
the gene(s) of interest? 

One of the key issues raised at the work- 
shop was the sensitivity of microarray technol- 
ogy. Experiments by General Scanning, Inc. 
(Watertown, MA) have shown that by using 
the Cy dyes and their scanner, signal can be 
detected down to levels of < 1 fluor molecule 
per square micrometer, which translates to 
detecting a rare message at approximately one 
copy per cell or less. 

Array Applications 

Although arrays are an emerging technology 
certain to undergo improvement and 
alteration, they have already been applied use- 
fully to a number of model systems. Arrays are 
at their most powerful when they contain the 
entire genome of the species they are being 
used to study. For this reason, they have strong 
support among researchers utilizing yeast and 
Caenorhabditis elegans (5). The genomes of 
both of these species have been sequenced and, 
in the case of yeast, deposited onto arrays for 
examination of gene expression (6,7). With 
both of these species, it is relatively easy to 
perturb individual gene expression. Indeed, C. 
elegans knockouts can be made simply by 
soaking the worms in an antisense solution of 
the gene to be knocked out. 

By a process of systematic gene disrup- 
tion, it is now possible to examine the cause 
and effect relationships between different 
genes in these simple organisms. This kind of 
approach should help elucidate biochemical 
pathways and genetic control processes, 
deconvolute polygenic interactions, and 
define the architecture of the cellular network. 
A simple case study of how this can be 
achieved was presented by Butow [University 
of Texas Southwestern Medical Center, 
Dallas, TX (Figure 2)]. Although it is the 
phenotypic result of a single gene knockout 
that is being examined, the effect of such 
perturbation will almost always be polygenic. 
Polygenic interactions will become increasing- 
ly important as researchers begin to move 
away from single gene systems when examin- 
ing the nature of toxicologic responses to 
external stimuli. This is especially important 
in toxicology because the phenotype pro- 
duced by a given environmental insult is 
never the result of the action of a single gene; 
rather, it is a complex interaction of one or 
multiple cellular pathways. Phenomena such 
as quantitative trait (the continuous variation 
of phenotype), epistasis (the effect of alleles of 
one or more genes on the expression of other 
genes), and penetrance (proportion of indi- 
viduals of a given genotype that display a par- 
ticular phenotype) will become increasingly 
evident and important as toxicologists push 
toward the ultimate goal of matching the 
responses of individuals to different 
environmental stimuli. 

Analysis of the transcriptome (the expres- 
sion level of all the genes in a given cell popula- 
tion) was a use of arrays addressed by several 
speakers. Unfortunately, current gene nomen- 
clature is often confusing in that single genes 
are allocated multiple names (usually as a result 
of independent discovery by different laborato- 
ries), and there was a call for standardization of 
gene nomenclature. Nevertheless, once a tran- 
scriptome has been assembled it can then be 
transferred onto arrays and used to screen any 
chosen system. The EPA MicroArray 
Consortium (EPAMAC) is assembling testes 
transcriptomes for human, rat, and mouse. In a 
slightly different approach, Nuwaysir et al. (8) 
describe how the NIEHS assembled what is 
effectively a "toxicological transcriptome", a 
library of human and mouse genes that have 
previously been proven or implicated in 
responses to toxicologic insults. Clontech 
Laboratories, Inc. (Palo Alto, CA), has begun a 
similar process by developing stress/toxicology 
filter arrays of rat, mouse, and human genes. 
Thus, rather than being tissue or cell specific, 
these stress/toxicology arrays can be used across 
a variety of model systems to look for alter- 
ations in the expression of toxicologically 
important genes and define the new field of 
toxicogenomics. The potential to identify toxi- 
cant families based on tissue- or cell-specific 
gene expression could revolutionize drug test- 
ing. These molecular signatures or fingerprints 
could not only point to the possible 
toxicity/carcinogenicity of newly discovered 
compounds (Figure 3), but also aid in elucidat- 
ing their mechanism of action through identifi- 
cation of gene expression networks. By exten- 
sion, such signatures could provide easily iden- 
tifiable biomarkers to assess the degree, time, 
and nature of exposure. 

DNA arrays are primarily a tool for exam- 
ining differential gene expression in a given 
model. In this context they are referred to as 
closed systems because they lack the ability of 
other differential expression technologies, e.g., 
differential display and subtractive hybridiza- 
tion, to detect previously unknown genes not 
present on the array. This would appear to 
limit the power of DNA arrays to the imagina- 
tions and preconceptions of the researcher in 
selecting genes previously characterized and 
thought to be involved in the model system. 
However, the various genome sequencing pro- 
jects have created a new category of 
sequence, the EST, that has partially molli- 
fied this deficiency. ESTs are cDNAs expressed 
in a given tissue that, although they may share 
some degree of sequence similarity to previous- 
ly characterized genes, have not been assigned 
specific genetic identity. By incorporating EST 
clones into an array, it is possible to monitor 
the expression of these unknown genes. This 
can enable the identification of previously 
uncharacterized genes that may have biologic 
significance in the model system. Filter arrays 
from Research Genetics and slide arrays from 
Incyte Pharmaceuticals both incorporate large 
numbers of ESTs from a variety of species. 

A further use of microarrays is the identifi- 
cation of single nucleotide polymorphisms 
(SNPs). These genomic variations are abun- 
dant (they occur approximately every 1 kb or 
so) and are the basis of restriction fragment 
length polymorphism analysis used in forensic 
analysis. Affymetrix, Inc. designed chips that 
contain multiple repeats of the same gene 
sequence. Each position is present with all four 
possible bases. After the hybridization of the 
sample, the degree of hybridization to the dif- 
ferent sequences can be measured and the exact 
sequence of the target gene deduced. SNPs are 
thought to be of vital importance in drug 
metabolism and toxicology. For example, sin- 
gle base differences in the regulatory region or 
active site of some genes can account for huge 
differences in the activity of that gene. Such 
SNPs are thought to explain why some people 
are able to metabolize certain xenobiotics bet- 
ter than others. Thus, arrays provide a further 
tool for the toxicologist investigating the 
nature of susceptible subpopulations and toxi- 
cologic response. 
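
The sequence deduction described for chips that carry all four bases at each position can be illustrated with a toy base caller. This is a simplified sketch under stated assumptions, not the vendor's algorithm: each position is represented by four hypothetical probe intensities, the brightest probe is called, and a position is reported as "N" when the brightest signal does not exceed the runner-up by an assumed factor (`min_ratio`).

```python
def call_bases(probe_intensities, min_ratio=1.5):
    """Call one base per position from four-probe hybridization signals.

    probe_intensities: list of dicts mapping "A", "C", "G", "T" to signal.
    min_ratio: assumed minimum ratio of best to second-best signal.
    """
    calls = []
    for position in probe_intensities:
        ranked = sorted(position.items(), key=lambda item: item[1], reverse=True)
        best_base, best_signal = ranked[0]
        second_signal = ranked[1][1]
        if best_signal > 0 and best_signal >= min_ratio * second_signal:
            calls.append(best_base)
        else:
            calls.append("N")  # ambiguous: no probe clearly dominates
    return "".join(calls)


# Hypothetical signals for three positions of a target gene.
signals = [
    {"A": 900.0, "C": 100.0, "G": 80.0, "T": 120.0},
    {"A": 90.0, "C": 100.0, "G": 850.0, "T": 70.0},
    {"A": 400.0, "C": 380.0, "G": 50.0, "T": 60.0},
]

sequence = call_bases(signals)
```

An ambiguous position such as the third one could reflect a heterozygous sample or cross-hybridization; the sketch flags it without attempting to distinguish the two.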

There are still many wrinkles to be ironed 
out before arrays become a standard tool for 
toxicologists. The main issues raised at the 
workshop by those with hands-on experience 
were the following: 

• Expense: the cost of purchasing/contracting 
this technology is still too great for many 
individual laboratories. 

Figure 2. Potential effects of gene knockout within 
positively and negatively regulated gene expression 
networks. j1 is limiting in wild type for expression of 
i. (A) A simple, two-component, linear regulatory 
network operating on gene i, where j1 is a positive 
effector of i and jn is either a positive or negative 
effector of j1. This network could be deduced by 
examining the consequence of (B) deleting jn on the 
expression of j1 and i, where the expression of i 
would be decreased or increased depending on 
whether jn was a positive or negative regulator. 
These and other connected components of even 
greater complexity could be revealed by genome- 
wide expression analysis. From Butow (15). 



• Clones: the logistics of identifying, obtaining, 
and maintaining a set of nonredundant, non- 
contaminated, sequence-verified, species/cell/ 
tissue/field-specific clones. 

• Use of inbred strains: where whole-organism 
models are being used, the use of inbred 
strains is important to reduce the potentially 
confusing effects of the individual variation 
typically seen in outbred populations. 

• Probe: the need for relatively large amounts 
of RNA, which limits the type of sample 
(e.g., biopsy) that can be used. Also, different 
RNA extraction methods can give different 
results. 

• Specificity: the ability to discriminate accu- 
rately between closely related genes (e.g., the 
cytochrome p450 family) and splice variants. 

• Quantitation: the quantitation of gene 
expression using gene arrays is still open to 
debate. One reason for this is the different 
incorporation of the labeling dyes. However, 
the main difficulty lies in knowing what to 
normalize against. One option is to include a 
large number of so-called housekeeping genes 
in the array. However, the expression of these 
genes often changes depending on the tissue 
and the toxicant, so it is necessary to charac- 
terize the expression of these genes in the 
model system before utilizing them. This is 
clearly not a viable option when screening 
multiple new compounds. A second option 
is to include on the array genes from a nonre- 
lated species (e.g., a plant gene on an animal 
array) and to spike the probe with synthetic 
RNA(s) complementary to the gene(s). 

• Reproducibility: this is sometimes question- 
able, and a figure of approximately two or 
three repeats was used as the minimum num- 
ber required to confirm initial findings. 
Again, however, most people advocated the 
use of Northern blots or reverse transcriptase 
PCR to confirm findings. 

• Sensitivity: concerns were voiced about the 
number of target molecules that must be pre- 
sent in a sample for them to be detected on 
the array. 

• Efficiency: reproducible identification of 1.5- 
to 2-fold differences in expression was report- 
ed, although the number of genes that 
undergo this level of change and remain 
undetected is open to debate. It is important 
that this level of detection be ultimately 
achieved because it is commonly perceived 
that some important transcription factors 
and their regulators respond at such low lev- 
els. In most cases, 5- to 10-fold was the mini- 
mum change that most were happy to 
accept. 

• Bioinformatics: perhaps the greatest concern
was how to interpret the data with
the greatest accuracy and efficiency. The
biggest headache is trying to identify
networks of gene expression that are common to
different treatments or doses. The amount of
data from a single experiment is huge. It may
be that, in the future, several groups
individually equipped with specialized software
algorithms for studying their favorite genes or
gene systems will be able to share the same
hybridized chips. Thus, arrays could usher in
a new perspective on collaboration and the
sharing of data.
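
The quantitation and efficiency points in the list above can be illustrated with a small sketch. It assumes that each array is summarized as a mapping from spot name to raw intensity, that the spiked-in nonrelated-species controls are known by name, and that a 2-fold cutoff is of interest. The function names, the median-based scaling, and the cutoff are illustrative assumptions, not a method described at the workshop.

```python
from statistics import median


def normalize_to_spikes(intensities, spike_ids):
    """Scale raw spot intensities so the median spike-in control equals 1.0."""
    scale = median(intensities[s] for s in spike_ids)
    if scale <= 0:
        raise ValueError("spike-in controls must have positive signal")
    return {spot: value / scale for spot, value in intensities.items()}


def changed_genes(control, treated, spike_ids, min_fold=2.0):
    """Return treated/control ratios for genes changing by at least min_fold."""
    ctrl = normalize_to_spikes(control, spike_ids)
    trt = normalize_to_spikes(treated, spike_ids)
    changed = {}
    for gene, ctrl_value in ctrl.items():
        # Skip the controls themselves and spots without usable signal.
        if gene in spike_ids or gene not in trt:
            continue
        if ctrl_value <= 0 or trt[gene] <= 0:
            continue
        ratio = trt[gene] / ctrl_value
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            changed[gene] = ratio
    return changed


# Hypothetical intensities: the treated array is twice as bright overall,
# so only geneB shows a real change once the spike-ins are accounted for.
control = {"spike1": 100, "spike2": 200, "geneA": 300, "geneB": 150}
treated = {"spike1": 200, "spike2": 400, "geneA": 600, "geneB": 1500}
result = changed_genes(control, treated, {"spike1", "spike2"})
```

Scaling to spike-in controls rather than to housekeeping genes avoids the assumption that any endogenous gene is unaffected by the treatment, which is the concern raised under Quantitation.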

EPAMAC 

Perhaps the main reason most scientists are
unable to use array technology is the high cost
involved, whether buying off-the-shelf
membranes, using contract printing services, or
producing chips in-house. In view of this,
researchers at the RTD/NHEERL initiated
the EPAMAC. This consortium brings
together scientists from the EPA and a number
of extramural labs with the aim of developing
microarray capability through the sharing
of resources and data. EPAMAC
researchers are primarily interested in the
developmental and toxicologic changes seen
in testicular and breast tissue, and a portion
of the workshop was set aside for EPAMAC
members to share their ideas on how the
experimental application of microarrays could
facilitate their research. One of the central
areas of interest to EPAMAC members is the
effect of xenobiotics on male fertility and
reproductive health. Of greatest concern is
the effect of exposure during critical periods
of development and germ cell differentiation
(9), and how this may compromise sperm
counts and quality following sexual maturation
(10). As well as spermatogenic tissue,
there is also interest in how residual mRNA
found in mature sperm (11) could be used as
an indicator of previous xenobiotic effects (it
is easier to obtain a semen sample than a
testicular biopsy). Arrays will be used to examine
and compare the effect of exposure to heat
and chemicals on testicular and epididymal
gene expression profiles, with the aim of
establishing relationships/associations
between changes in developmental landmarks
and the effects on sperm count and quality.
Cluster, pattern, and other analysis of such
data should help identify hidden relationships
between genes that may reveal potential
mechanisms of action and uncover roles for
genes with unknown functions.

Summary 

The full impact of DNA arrays may not be
seen for several years, but the interest shown at
this regional workshop indicates the high level
of interest that they foster. Apart from educating
and advertising the various technologies in
this field, this workshop brought together a
number of researchers from the Research
Triangle Park area who are already using DNA
arrays. The interest in sharing ideas and
experiences led to the initiation of a Triangle array
user's group.

Array technology is still in its infancy. This
means that the hardware is still improving and
there is no current consensus for standard
procedures, quantitation, and interpretation.
Consistency in spotting and scanning arrays is
not yet optimized, and this is one of the most
critical requirements of any experiment. In
addition, one of the dark regions of array
technology (strife in the courts over who owns
what portions of it) has further muddled the
future and is a potential barrier toward the
development of consensus procedures.

Perhaps the greatest hurdle for the application
of arrays is the actual interpretation of
data. No specialists in bioinformatics attended 
the workshop, largely because they are rare and 
because as yet no one seems clear on the best 
method of approaching data analysis and inter- 
pretation. Cross-referencing results from mul- 
tiple experiments (time, dose, repeats, different 
animals, different species) to identify common- 
ly expressed genes is a great challenge. In most 
cases, we are still a long way from understanding
how the expression of gene X is related to
the expression of gene Y, and from ordering gene
expression to delineate causal relationships.

To the ordinary scientist in the typical
laboratory, however, the most immediate problem
is a lack of affordable instrumentation.
One can purchase premade membranes at
relatively affordable prices. Although these
may be useful in identifying individual genes
to pursue in more detail using other methods,
the numbers that would be required for even a
small routine toxicology experiment prohibit
this as a truly viable approach. For the
toxicologist, there is a need to carry out multiple
experiments: dose responses, time curves,
multiple animals, and repeats. Glass-based
DNA arrays are most attractive in this context
because they can be prepared in large batches
from the same DNA source and accommodate
control and treated samples on the same
chip. Another problem with current off-the-
shelf arrays is that they often do not contain
one or more of the particular genes a group is
interested in. One alternative is to obtain
and/or produce a set of custom clones and
have contract printing of membranes or slides
carried out by a company such as Genomic
Solutions, Inc. (Ann Arbor, MI). This approach
is less expensive than laying out for
one's own entire system, although at some
point it might make economic sense to print
one's own arrays.

Finally, DNA arrays are currently a team
effort. They are a technology that uses a wide
range of skills including engineering, statistics,
molecular biology, chemistry, and
bioinformatics. Because most individuals are skilled in
only one or perhaps two of these areas, it
appears that success with arrays may be best
expected by teams of collaborators consisting
of individuals having each of these skills.

Those considering array applications may
be amused or goaded on by the following
quote from Fortune magazine (12):

Microprocessors have reshaped our economy,
spawned vast fortunes and changed the way we live.
Gene chips could be even bigger.

Although this comment may have been 
designed to excite the imagination rather than 
accurately reflect the truth, it is fair to say that 
the age of functional genomics is upon us. 
DNA arrays look set to be an important tool in 
this new age of biotechnology and will likely 
contribute answers to some of toxicology's 
most fundamental questions. 

References and Notes 

1. The chipping forecast. Nat Genet 21(suppl 1):3-40 (1999).

2. National Center for Biotechnology Information. The
Unigene System. Available: www.ncbi.nlm.nih.gov/
Schuler/UniGene [cited 22 March 1999].

3. Brown PO. The Brown Lab. Available: http://
cmgm.stanford.edu/pbrown [cited 22 March 1999].

4. Chen JJ, Wu R, Yang PC, Huang JY, Sher YP, Han MH,
Kao WC, Lee PJ, Chiu TF, Chang F, et al. Profiling
expression patterns and isolating differentially expressed genes
by cDNA microarray system with colorimetry detection.
Genomics 51:313-324 (1998).

5. Ward S. DNA Microarray Technology to Identify Genes
Controlling Spermatogenesis. Available: www.mcb.
arizona.edu/wardlab/microarray.html [cited 22 March 1999].

6. Marton MJ, DeRisi JL, Bennett HA, Iyer VR, Meyer MR,
Roberts CJ, Stoughton R, Burchard J, Slade D, Dai H, et
al. Drug target validation and identification of secondary
drug target effects using DNA microarrays. Nat Med
4:1293-1301 (1998).

7. Brown PO. The Full Yeast Genome on a Chip. Available:
http://cmgm.stanford.edu/pbrown/yeastchip.html [cited
22 March 1999].

8. Nuwaysir EF, Bittner M, Trent J, Barrett JC, Afshari CA.
Microarrays and toxicology: the advent of
toxicogenomics. Mol Carcinog 24(3):153-159 (1999).

9. Hecht NB. Molecular mechanisms of male germ cell
differentiation. Bioessays 20:555-561 (1998).

10. Zacharewski TR. Timothy R. Zacharewski. Available:
www.bch.msu.edu/faculty/zachar.htm [cited 22 March 1999].

11. Kramer JA, Krawetz SA. RNA in spermatozoa:
implications for the alternative haploid genome. Mol Hum
Reprod 3:473-478 (1997).

12. Stipp D. Gene chip breakthrough. Fortune, March
31:56-73 (1997).

13. Kawasaki E (General Scanning Instruments, Inc.,
Watertown, MA). Unpublished data.

14. Flowers P, Overbeck J, Mace ML Jr, Pagliughi FM,
Eggers WJE, Yonkers H, Honkanen P, Montagu J, Rose
SD. Development and Performance of a Novel
Microarraying System Based on Surface Tension
Forces. Available: http://www.geneticmicro.com/
resources/html/coldspring.html [cited 22 March 1999].

15. Butow R (University of Texas Medical Center, Dallas, TX).
Unpublished data.



Research Update 



TRENDS in Biochemical Sciences Vol.27 No.6 June 2002



Reference 8 of 11 

of Response dated 12/04/03 

In USSN 09/937,060



Protein Sequence Motif

Diacylglyceride kinases, sphingosine kinases and NAD
kinases: distant relatives of 6-phosphofructokinases

Gilles Labesse, Dominique Douguet, Liliane Assairi and Anne-Marie Gilles 



Fig. 1. Reactions catalyzed by DGKs, SKs, PFKs, PPNKs and possibly Y036_SYNY3. The phosphate group transferred
to the product is circled. PPn indicates the polyphosphate molecules of various length used by PFPs; ATP is the
preferred phosphate donor for the other kinases. Long acyl chains are shown as zigzag lines in the structure of the
substrate of SK and DGK. The putative function of Y036_SYNY3 was deduced by similarity and its putative
substrate specificity was predicted from domain-swapping analysis. Abbreviations: DGKs, diacylglyceride kinases;
PFKs, 6-phosphofructokinases; PFP(β), pyrophosphate-dependent phosphofructokinase (β subunit);
PPNKs, polyphosphate/ATP NAD kinases; SKs, sphingosine kinases.




Diacylglyceride kinases, sphingosine
kinases, NAD kinases and
6-phosphofructokinases are thought to be
related despite large evolution of their
sequences. Discovery of a common
signature has led to the suggestion that
they possess a similar phosphate-donor-
binding site and a similar phosphorylation
mechanism. The substrate- and allosteric-
binding sites are much more divergent and
their delineation remains to be determined
experimentally.

Despite their importance in key metabolic 
and signaling pathways, diacylglyceride 
kinases (DGKs) , sphingosine 
kinases (SKs) and polyphosphate/ATP 
NAD kinases (PPNKs) are poorly 
characterized enzymes. At present, 
there are no structural data available for 
these kinases. By contrast, the 
6-phosphofructokinases (PFKs) are 
well-characterized enzymes that have no 
protein relatives other than the 
pyrophosphate-dependent 
phosphofructokinases (PFPs). Extending 
sequence comparisons of all these kinases 
to structural analysis, threading and 
sequence motif searches, suggested that 
they are weakly related, both with respect 
to structure and function. These 
observations explain some of the 
experimental data discussed in the 
following text (e.g. results from directed 
mutagenesis) , and also suggest a function 
for a related hypothetical protein 
(SWISSPROT: Y036_SYNY3).

PFKs (EC 2.7.1.11; Pfam PF00365)
phosphorylate D-fructose 6-phosphate
(Fig. 1) and regulate glycolytic flux. The
crystal structure of this allosteric enzyme
was solved in complex with its substrate
and a phosphate donor [1]. PFK is
~300 amino acids (aa) in length, and
comprises two similar (α/β) lobes: one
involved in ATP binding and the other
housing both the substrate-binding site
and the allosteric site (a regulatory
binding site distinct from the active site,
but that affects enzyme activity). Both
PFKs and the related PFPs [2]
(EC 2.7.1.90) are dimeric or tetrameric.
In the crystal structure [1], the phosphate
donor (ADP) binds a specific sequence
motif, GGdGs (upper and lower case
letters refer to strictly and loosely
conserved amino acids, respectively),
which contains the aspartate involved in
Mg2+ chelation.

The phosphorylation of NAD was 
recently shown to be catalyzed by a 
member of the PPNK family 
[polyphosphate/ATP-NAD kinase,

[Fig. 2 alignment, two blocks, grouped by subfamily. SK: SPH2_HUMAN. DGK: Y036_SYNY3, BMRU_BACSU, KDGZ_HUMAN. PPNK: UTR1_YEAST, PPNK_ECOLI, PPNK_HELPY. PFK: K6PF_HUMAN, 1PFK. Each group carries a JPRED2 or P-SEA secondary structure line.]




Fig. 2. Multiple sequence alignment of DGKs, PFKs, PPNKs and related enzymes in the region of the conserved
motifs. The alignment was performed manually. Sequence codes are from the SWISSPROT database [4] and the
corresponding secondary structure predictions, obtained from JPRED2 [17], were renamed accordingly
(e.g. JPREDY036 for Y036_SYNY3). The secondary structure assignment for the crystal structure PDB1PFK [1] was
performed using P-SEA (http://bioserv.cbs.cnrs.fr/HTML_BIO/frame_sea.html). Predicted α helices and β strands are
denoted by a purple 'a' and a red 'b' respectively. The Asp residue in the conserved motif 'gGdgs' is highlighted with
a yellow background and the motifs previously defined for the identification of DGKs and SKs are underlined. The
position of the signature derived using PATTINPROT is shown by asterisks. Following the PATTINPROT [23]
nomenclature, its sequence reads as: X(40)-{WEKG}-{RWVI}-[AGNTVIMLFYHR]-[ASVILFM]-[CVILFYHM]-
[GAPCSTVILFM]-[GASCVILM]-[SG]-G-[EDN]-[GDN]-[STFLIVAEDN]-[AWIMLFYHRN]-X(9,35)-{TWCFM}-
[PCVILMFKT]-[GASTPINHQKR]-[ASCTVIFLM]-[GASCVIMLFY]-[GASCTPVIF]-[ACTVIMLFRNG]-[GASNHRKPV]-
[GATDEKRYUMVC]-[GSDTI]-[GASTPVILFNHRH^

'X' stands for any amino acid, numbers in parentheses indicate the number of possible repetitions, [] surrounds
possible substitutions at one position and {} corresponds to unallowed amino acids. Abbreviations: DGKs,
diacylglyceride kinases; PFKs, 6-phosphofructokinases; PPNKs, polyphosphate/ATP NAD kinases; SKs, sphingosine
kinases. This multiple sequence alignment (alignment number ALIGN_000335) has been deposited with the
European Bioinformatics Institute (ftp://ftp.ebi.ac.uk/pub/databases/embl/align/ALIGN_000335.dat).



EC 2.7.1.23; Fig. 1]. Members of this
family are 260-320 aa in length and are
usually dimeric or tetrameric. The
recently identified PPNK from
Mycobacterium tuberculosis [3]
(SWISSPROT [4]: PPNK_MYCTU), and
orthologs from distinct complete
genomes, were used for PSI-BLAST
searches [5]; these revealed weak
sequence similarities (E-value
range: 1.00-0.02) to various PFKs.

DGKs (EC 2.7.1.107) phosphorylate 
diacylglycerol (Fig. 1), which is a second 
messenger that activates protein kinase C 
and is important in cell regulation [6]. The 
monomeric mammalian isoenzymes
(550-1170 aa) possess several domains
[6,7]. For example, the C terminus 
houses the catalytic domain, which is 
regulated by anionic amphiphiles 
(e.g. phosphatidylserine). DGKs also
possess an original motif (named DAGKc
in SMART [8] and PF00781 in Pfam [9])
that contains a run of hydrophobic residues
(φ) followed by the sequence GGDGT.
Eukaryotic DGKs are related to as-yet- 
uncharacterized bacterial homologs, 
including Y036_SYNY3. 

The monomeric SKs (~49 kDa) are
related to DGKs. They phosphorylate
sphingosine to form sphingosine
1-phosphate (Fig. 1), which acts both
as an intracellular second messenger
(e.g. in the inhibition of apoptosis)
and as a ligand for a family of
G-protein-coupled receptors [10]. SKs are
regulated by acidic phospholipids
(e.g. dioleoylphosphatidylserine) [10,11].
Several regions are highly conserved
among SKs, including a motif built around
the sequence SGDG. Mutation of the second
glycine in this SK signature (G82D in
human SK) abolishes the kinase activity
[11], as does mutation of the equivalent
glycine residue in several DGKs [7,11].
SKs and DGKs are novel kinases sharing 
a common ~350-aa-long catalytic domain;
they have no significant similarity to other 
known kinases [7]. 

Using PSI-BLAST with default 
parameters to search the SWISSPROT 
database with Y036_SYNY3 (aa 1-433) as 
a query revealed significant similarities, 
at convergence (14 iterations), between 
the C terminus of the query and: (1) DGKs 
(E-value ranging from e^-95 to e^-60) and
(2) PPNKs (e^-60 to e^-27). The N terminus
(aa 1-116) of the query showed significant
similarities with bacterial methylglyoxal 
synthases [12]. These enzymes use 
dihydroxyacetone-phosphate (DHA-P) as 
a substrate, thereby suggesting that the 
kinase domain of the query might be 
involved in DHA phosphorylation (Fig. 1). 
Similarities to the PFKs appeared just 
below the default threshold of 0.002. 
These sequence similarities can be 
extended to PFKs by adding a couple of 
PFK sequences (E-value: 0.02-0.50) to
the inclusion set before resuming the 
PSI-BLAST search. At the second 
convergence, 101 sequences from the 
SWISSPROT database are detected. 
Searches with a slightly higher threshold 
(0.006 and 0.004 instead of 0.002) gave 
the same final results in only one 
convergence (20th and 15th iteration, 
respectively). Shifting to the more 
complete and non-redundant database 
GenPept, and using the matrices PAM70 
(inclusion threshold: 0.002) or 
BLOSUM80 (inclusion threshold: 0.004), 
confirmed the previous results. 
At convergence, after 21 iterations 
(PAM70) , DGKs, SKs, PFKs and PPNKs 
showed significant sequence similarities 
(e^-52 to e^-6), which extended over a
common region of ~300 aa.

The fold compatibility between these 
enzymes was further analyzed using 
3D-PSSM [13], FUGUE [14], 
GenTHREADER [15], PDB-BLAST 
(http://bioinformatics.burnham-inst.org/
pdb_blast/), SAM-T99 [16] and J-PRED2
[17] through our meta-server [18]. With
most queries (including those in Fig. 2), 
PDB-BLAST and SAM-T99, used with 
default parameters, showed weak but 
significant similarities (E-values <0.01) to 
PFK. Fold recognition results were 
further analyzed for single-domain 
protein sequences to avoid any noise
resulting from possible domain-domain
interfaces and improper delimitation
of the domains. The best scores
(corresponding to >95% certainty)
were obtained using the sequence
PPNK_ECOLI (3D-PSSM, E-value:
8.16e-02 and FUGUE, Z-score: 5.27). The
other observed hits corresponded mostly
to related (α/β) folds.

Analysis of the resulting multiple 
alignment deduced from the PSI-BLAST 
search showed that DGKs, PPNKs and 
PFKs align in the same common region, 
forming a short, well-conserved motif 
(Fig. 2). A PHI-BLAST 2.1.2 search [19]
using the various queries and the short
seed pattern [GS]-G-[ED]-G-[ST] also
revealed significant similarities (~0.004)
to the Pfam profile of the PFKs 
(PF00365). 
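
As an illustration of how a seed pattern such as [GS]-G-[ED]-G-[ST] can be scanned against a sequence, the sketch below translates PROSITE-style notation into a regular expression. It is not how PHI-BLAST or PATTINPROT work internally; the function names are hypothetical and the test sequence is invented, not taken from Fig. 2.

```python
import re

# One pattern element: a residue set [..], an exclusion set {..}, a single
# residue, or the wildcard X/x, optionally followed by (n) or (n,m).
ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def pattern_to_regex(pattern):
    """Translate a PROSITE-style pattern (elements joined by '-') to a regex."""
    pieces = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unrecognized pattern element: {element!r}")
        core, low, high = match.groups()
        if core in ("X", "x"):
            core = "."                      # any amino acid
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # unallowed amino acids
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        pieces.append(core)
    return "".join(pieces)


def find_motif(pattern, sequence):
    """Return (1-based start, matched text) for each non-overlapping hit."""
    regex = re.compile(pattern_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


# Invented sequence containing a GGDGT stretch.
hits = find_motif("[GS]-G-[ED]-G-[ST]", "MKLLVFVNPGGDGTVLAA")
```

The same translation handles the repetition and exclusion syntax used in the longer signature quoted in the Fig. 2 legend, for example `X(9,35)` and `{TWCFM}`.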

A PATTINPROT search [20] was
performed starting with a motif of
hydrophobic residues (φ) around the
conserved 'gGdgs' core to refine the signature of
extended to a unique and specific 
signature that is common to these four 
previously unrelated kinase subfamilies 
(finding 230 sequences of DGKs, SKs, 
PPNKs and PFKs in the non-redundant 
database and 87 sequences out of 101 in 
SWISSPROT, the latter set containing 
partial sequences and inactive PFP
α subunits). This new motif showed a
highly significant E-value of 2.7e-11. This
signature encompassed both the ATP- and
substrate-binding sites of the crystal
structure of PFK [1], and also comprised
the two surrounding hydrophobic strands 
(highlighted by asterisks in Fig. 2). This 
signature also contained the previously 
described short motifs specific to the 
DGKs and SKs (underlined in Fig. 2) , and 
explained the results of directed 
mutagenesis obtained on DGKs [7,21,22]
and SKs [11]. Similarly, directed 
mutagenesis of the common motif also 
inactivated PPNK (L. Assairi and 
A-M. Gilles, unpublished). Its 
conservation suggested that these 
kinases might possess a similar 
ATP-binding site and might catalyze the 
phosphorylation using a common and 
specific mechanism. 

These results suggest that these 
kinases would belong to the same 
superfamily and might adopt the PFK fold 
despite the very low overall sequence 
identity (10-20% over ~250 aa; Fig. 2) [23].
The deduced alignment should help us 
design new experiments to characterize 



these kinases; for example, to precisely 
delineate their specific substrate-binding 
site. Directed mutagenesis is currently
being undertaken on PPNKs to define the
NAD-recognition motif.
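
The sequence identity figure quoted above (10-20% over ~250 aa) is a pairwise measure over aligned positions. A minimal sketch of one common way to compute it follows, assuming two rows of an alignment of equal length with '-' marking gaps. Definitions of percent identity differ in their choice of denominator; here only columns where both sequences have a residue are counted, which is an assumption rather than the convention stated by the authors. The example rows are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither row is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Invented alignment rows: 5 identical residues out of 7 compared columns.
identity = percent_identity("GGDGT-LA", "GGDGSALV")
```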
Proteomics: a major new
technology for the drug
discovery process

Proteomics is a new enabling technology that is being
integrated into the drug discovery process. This will 
facilitate the systematic analysis of proteins across any 
biological system or disease, forwarding new targets 
and information on mode of action, toxicology and sur- 
rogate markers. Proteomics is highly complementary to 
genomic approaches in the drug discovery process and, 
for the first time, offers scientists the ability to integrate 
information from the genome, expressed mRNAs, their 
respective proteins and subcellular localization. It is ex- 
pected that this will lead to important new insights into 
disease mechanisms and improved drug discovery 
strategies to produce novel therapeutics. 

Among the major pharmaceutical and biotechnol- 
ogy companies, it is clearly recognized that the 
business of modern drug discovery is a highly 
competitive process. All of the many steps in- 
volved are inherently complex, and each can involve a 
high risk of attrition. The players in this business strive 
continuously to optimize and streamline the process; each 
seeking to gain an advantage at every step by attempting 
to make informed decisions at the earliest stage possible. 
The desired outcome is to accelerate as many key activities 
in the drug discovery process as possible. This should pro- 



duce a new generation of robust drugs that offer a high 
probability of success and reach the clinic and market 
ahead of the competition. 

There has been noticeable emphasis over recent years 
for companies to aggressively review and refine their 
strategies to discover new drugs. Central to this has been 
the introduction and implementation of cutting-edge 
technologies. Most, if not all, companies have now inte- 
grated key technology platforms that incorporate gen- 
omics, mRNA expression analysis, relational databases, 
high-throughput robotics, combinatorial chemistry and 
powerful bioinformatics. Although it is still early days to 
quantify the real impact of these platforms in clinical and 
commercial terms, expectations are high, and it is widely 
accepted that significant benefits will be forthcoming. This 
is largely based on data obtained during preclinical studies 
where the genomic [1,2] and microarray [3,4] technologies have
already proved their value. 

However, there are several noteworthy outcomes that re- 
sult from this. Many comments are voiced that scientists 
armed with these technologies are now commonly faced 
with data overload. Thus, in some instances, rather than 
facilitating the decision process, the accumulation of more 
complex data points, many with unknown consequences, 
can seem to hinder the process. Also, most drug compa- 
nies have simultaneously incorporated very similar compo- 
nents of the new technology platforms, the consequence 
being that it is becoming difficult yet again to determine 
where a clear competitive advantage will arise. Finally, in 
recent years, largely as a result of the accessibility of the 
technologies, there has been an overwhelming emphasis 
placed on genomic and mRNA data rather than on protein
analysis. It is important to remember that proteins dictate
biological phenotype - whether it is normal or diseased -
and are the direct targets for most drugs.

Figure 1. Steps involved in analysing a biological sample by proteomics. MCI, molecular cluster index.

Proteomics: new technology for
the analysis of proteins 

It is now timely to recognize that complementary technol- 
ogy in the form of high-throughput analysis of the total 
protein repertoire of chosen biological samples, namely 
proteomics, is poised to add a new and important dimen- 
sion to drug discovery. In a similar fashion to genomics, 
which aims to profile every gene expressed in a cell, pro- 
teomics seeks to profile every protein that is expressed [5-7].
However, there is added information, since proteomics can
also be used to identify the post-translational modifications
of proteins [8], which can have profound effects on bio-
logical function, and their cellular localization. Importantly, 
proteomics is a technology that integrates the significant 
advances in two-dimensional (2D) electrophoretic separa- 
tion of proteins, mass spectrometry and bioinformatics. 
With these advances it is now possible to consistently de- 
rive proteomes that are highly reproducible and suitable 
for interrogation using advanced bioinformatic tools. 

There are many variations whereby different laboratories 
operate proteomics. For the purpose of this review, the 



process used at Oxford GlycoSciences (OGS), which uses 
an industrial-scale operation that is integral to its drug dis- 
covery work, will be described. The individual steps of 
this process, where up to 1000 2D gels can be run and 
analysed per week, are summarized in Fig. 1. The incom-
ing samples are bar coded and all information relevant to 
the sample is logged into a Laboratory Information 
Management System (LIMS) database. There can be a wide 
range in the type of samples processed, as applicable to 
individual steps in the drug discovery pipeline, and these 
will be mentioned later. The samples are separated accord- 
ing to their charge (pI) in the first dimension, using iso-
electric focusing, followed by size (MW) using SDS-PAGE 
in the second dimension. Many modifications have been 
made to these steps to improve handling, throughput and 
reproducibility. The separated proteins are then stained 
with fluorescent dyes which are significantly more sensi- 
tive in detection than standard silver methods and have a 
broader dynamic range. The image of the displayed pro- 
teins obtained is referred to as the proteome, and is digi- 
tally scanned into databases using proprietary software 
called ROSETTA™. The images are subsequently curated, 
which begins with the removal of any artefacts, cropping 
and the placement of pI/MW landmarks. The replicate
images are then aligned and matched to one
another to generate a synthetic composite image. This is
an important step, as the proteome is a dynamic situation, 
and it captures the biological variation that occurs, such 
that even orphan proteins are still incorporated into the 
analysis. 

By means of illustration, Fig. 1 shows the process 
whereby proteomes are generated from normal and dis- 
ease samples and how differentially expressed proteins are 
identified. The potential of this type of analysis is tremen- 
dous. For example, from a mammalian cell sample, in ex- 
cess of 2000 proteins can typically be resolved within the 
proteome. The quality of this is shown in Fig. 2, which 
shows representative proteomes from three diverse bio- 
logical sources: human serum, the pathogenic fungus 
Candida albicans and the human hepatoma cell line 
Huh7. 

Use of proteomics to identify
disease specific proteins

In most cases, the drug discovery process is initiated by 
the identification of a novel candidate target - almost al- 
ways a protein - that is believed to be instrumental in the 
disease process. To date, there is a variety of means 
whereby drug targets have been forthcoming. These in- 
clude molecular, cellular and genomic approaches, mostly 
centred upon DNA and mRNA analysis. The gene in ques- 
tion is isolated, and expression and characterization of its 
coded protein product - i.e. the drug target - is invariably 
a secondary event. 

With the proteomic approach, the starting point is at the 
other end of the 'telescope'. Here there is direct and im- 



mediate comparison of the proteomes from paired normal 
and disease materials. Examples of these pairs are: (1) pu- 
rified epithelial cell populations derived from human 
breast tumours, matched to purified normal populations of 
human breast epithelial cells, and (2) the invading patho- 
genic hyphal form of C. albicans, matched to the non- 
invading yeast form of C. albicans. When the proteome 
images from each pair are aligned, the Proteograph™ soft- 
ware is able to rapidly identify those proteins (each refer- 
enced as having a unique molecular cluster index, or MCI) 
that are either unique, or those that are differentially ex- 
pressed. Thus, the Proteograph output from this analysis is 
both qualitative and quantitative. 
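
The paired comparison described above can be sketched in a few lines, assuming each proteome is reduced to a mapping from MCI to a measured spot intensity. The signed fold change convention (negative values for proteins that decrease), the 2-fold cutoff, the function names, and the MCI values are all illustrative assumptions; this is not the Proteograph algorithm.

```python
def signed_fold_change(reference, sample):
    """Fold change of sample vs reference; negative when the sample is lower."""
    if reference <= 0 or sample <= 0:
        raise ValueError("intensities must be positive")
    if sample >= reference:
        return sample / reference
    return -(reference / sample)


def compare_proteomes(reference, sample, min_fold=2.0):
    """Split features into unique-to-one-side and differentially expressed."""
    only_reference = sorted(set(reference) - set(sample))
    only_sample = sorted(set(sample) - set(reference))
    differential = {}
    for mci in set(reference) & set(sample):
        change = signed_fold_change(reference[mci], sample[mci])
        if abs(change) >= min_fold:
            differential[mci] = change
    return only_reference, only_sample, differential


# Hypothetical MCIs and intensities for a normal/disease pair.
normal = {1: 10.0, 2: 40.0, 3: 5.0, 5: 20.0}
disease = {1: 80.0, 2: 10.0, 4: 7.0, 5: 22.0}
only_normal, only_disease, differential = compare_proteomes(normal, disease)
```

The unique features give the qualitative part of the output and the signed fold changes give the quantitative part.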

Proteograph analysis for a particular study can also be 
undertaken on any number of samples. For example, one 
might compare anything from a few to several hundred 
preparations or samples, each from a normal and disease 
counterpart, and have these analysed in a single 
Proteograph study. In this way, it is possible to assign 
strong statistical confidence to the data and in some in- 
stances to identify specific subpopulations within the input 
biological sources. This feature will become increasingly 
significant in the near future, and there is a clear synergy 
here whereby proteomics can work closely with pharma- 
cogenomic approaches to stratify patient populations and 
achieve effective targeted care for the patient. Whatever 
the source of the materials, the net output of Proteograph 
analysis is immediate identification of disease specific pro- 
teins. This is shown in Fig. 3, which shows the results of 
a proteograph obtained by comparing untreated human 
hepatoma cells with cells following exposure to a clinical 




Figure 2. Representative proteomes obtained from (a) human serum, (b) the pathogenic fungus Candida albicans 
and (c) the human hepatoma cell line Huh 7. 

Figure 3 key. Background: untreated Huh7 cells. Bars (not
reproduced here) mark features upregulated or downregulated
in Huh7 cells treated with 5-FU with respect to untreated
Huh7 cells.

MCI             Fold change   pI     MW
(not legible)   16.3          5.60   13718
(not legible)   15.8          8.73   13224
(not legible)   12.3          8.94   13117
4052            11.7          8.96   21434
4064            11.1          9.37   29389
3792            10.9          8.75   41162
3976            10.5          7.14   12061
3672            10.5          8.83   24121
3986            10.4          4.61   10853
3759            9.9           4.98   41420
4032            9.7           4.56   13117
1972            -9.4          7.92   12396
3587            9.4           4.50   14954
2033            -9.4          7.93   11126
3984            8.9           5.32   11090
2403            -8.8          6.04   25950
3748            8.8           6.32   35513
2105            -8.7          4.76   20803
3897            8.7           4.94   87842
4221            8.6           5.78   71963


Figure 3. Table of differential protein expression 
profiles, referred to as a Rosetta Proteograph™, 
between Huh 7 cells with and without the cytotoxic 
agent 5-FU. Bars are quantized and do not represent 
exact fold change values. 



cytotoxic agent. In this instance, only the top 20 differen- 
tially expressed MCIs are shown, but the readout would 
normally extend to a defined cut-off value, typically a two- 
fold or greater difference in expression levels, determined 
by the user. 
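
As a rough sketch of how such a readout might be truncated, the snippet below keeps features at or above a user-defined cut-off and ranks them by the magnitude of the change. The first five (MCI, fold change) pairs are taken from Figure 3; the last two entries, the function name `top_changes` and its defaults are hypothetical.

```python
# (MCI, signed fold change). The first five pairs come from Figure 3;
# MCIs 9001 and 9002 are invented to show the cut-off at work.
features = [
    (4052, 11.7), (1972, -9.4), (3587, 9.4), (2105, -8.7), (4221, 8.6),
    (9001, 1.4), (9002, -1.8),
]


def top_changes(features, cutoff=2.0, limit=20):
    """Keep features with |fold change| >= cutoff, largest change first."""
    kept = [f for f in features if abs(f[1]) >= cutoff]
    kept.sort(key=lambda f: abs(f[1]), reverse=True)  # stable for ties
    return kept[:limit]


ranked = top_changes(features)
for mci, fold in ranked:
    print(mci, fold)
```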

In a typical analysis involving disease and normal
mammalian material, in which each proteome would have
~2000 protein features each assigned an MCI, the
Proteograph might identify somewhere in the region of 50-300
MCIs that are unique or differentially expressed. To
capitalize rapidly on these data, OGS applies a
high-throughput mass spectrometry facility coupled to
advanced databases to annotate these MCIs as individual
proteins. As these are all disease specific proteins, each
could represent a novel target and/or a novel disease
marker. The process becomes even more powerful when a panel
of features, rather than individual features, is assigned.
The relevance
of this is apparent when one considers that most diseases, 
if not all, are multifactorial in nature and arise from poly- 
genic changes. Rather than analysing events in isolation, 
the ability to examine hundreds or thousands of events 
simultaneously, as shown by proteomics, can offer real 
advantages. 

Identification and assignment of candidate targets 
The rapid identification and assignment of candidate tar- 
gets and markers represents a huge challenge, but this has 
been greatly facilitated by combining the recent advances 
made in proteomics and analytical mass spectrometry 9 . 
Using automated procedures it is now possible to annotate
proteins present in femtomole quantities, which would
represent the low abundance class of proteins. The process of
annotation is similarly aided by the quality and richness of 
the sequence specific databases that are currently avail- 
able, both in the public domain and in the private sector 
(e.g. those supplied by Incyte Pharmaceuticals). In this re- 
spect, the advances in proteomics have benefited consider- 
ably from the breakthroughs achieved with genomics. 

From an application perspective, cancer studies provide a 
good opportunity whereby proteomics can be instrumental 
in identifying disease specific proteins, because it is often 
feasible to obtain normal and diseased tissue from the same 
patient. For example, proteomic studies have been reported
on neuroblastomas 10, human breast proteins from normal and
tumour sources 11-13, lung tumours 14, colon tumours 15 and
bladder tumours 16. There are also proteomic studies
reported within the cardiovascular therapeutic area, in
which disease or response proteins are identified 17,18.

Genomic microarray analysis can similarly identify 
unique species or clusters of mRNAs that are disease
specific. However, in some instances, there is a clear lack
of correlation between the levels of a specific mRNA and its
corresponding protein (Ref. 19; Gygi, S.P. et al.,
submitted). This has now been noted by many investigators and
reaffirms that post-transcriptional events, including protein 
stability, protein modification (such as phosphorylation, 
glycosylation, acylation and methylation) and cell localiz- 
ation, can constitute major regulatory steps. Proteomic 
analysis captures all of these steps and can therefore pro- 
vide unique and valuable information independent from, 
or complementary to, genomic data. 

Proteomics for target validation and signal transduction
studies

The identification of disease specific proteins alone is in- 
sufficient to begin a drug screening process. It is critical to 
assign function and validation to these proteins by con- 
firming they are indeed pivotal in the disease process. 
These studies need to encompass both gain- and loss-of- 
function analyses. This would determine whether the activity 
of a candidate target (an enzyme, for example), eliminated 
by molecular/cellular techniques, could reverse a disease 
phenotype. If this happened, then the investigator would 
have increased confidence that a small-molecule inhibitor 
against the target would also have a similar effect. The 
proposal of candidate drug targets is often not a difficult 
process, but validating them is another matter. Validation 
represents a major bottleneck where the wrong decision 
can have serious consequences 20 . 

Proteomics can be used to evaluate the role of a chosen 
target protein in signal transduction cascades directly rel- 
evant to the disease. In this manner, valuable information 
is forthcoming on the signalling pathways that are per- 
turbed by a target protein and how they might be cor- 
rected by appropriate therapeutics. Techniques that are 
well established in one-dimensional protein studies to in- 
vestigate signalling pathways, such as western blotting 
and immunoprecipitation, are highly suited to proteomic 
applications. For example, the proteomes obtained can be 
blotted onto membranes and probed with antibodies 
against the target protein or related signalling mol- 
ecules 21-23 . Because proteomics can resolve >2000 pro- 
teins on a single gel, it is possible to derive important 
information on specific isoforms (such as glycosylated or 
phosphorylated variants) of signalling molecules. This will 
result in characterization of how they are altered in the 
disease process. Western immunoblotting techniques using
high-affinity antibodies will typically identify proteins
present at ~10 copies per cell (~1.7 fmol); this is in
contrast to the best fluorescent dyes currently available
that are limited to imaging proteins at 1000 or more
copies per cell. The level of sensitivity derived by these
applications will greatly facilitate interpretation of com- 
plex signalling pathways and contribute significantly to 
validation of the target under study. 
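
The two sensitivity figures quoted for immunoblotting can be related by simple arithmetic. The sketch below converts femtomoles to molecules and then, on the assumption (not stated in the text) that both figures describe the same loaded sample, estimates how many cells that sample would represent. The function names are illustrative.

```python
AVOGADRO = 6.022e23  # molecules per mole


def molecules_from_fmol(fmol):
    """Convert an amount in femtomoles to a number of molecules."""
    return fmol * 1e-15 * AVOGADRO


def cells_required(fmol, copies_per_cell):
    """Cells needed to supply `fmol` of a protein present at the given copy number."""
    return molecules_from_fmol(fmol) / copies_per_cell


molecules = molecules_from_fmol(1.7)
cells = cells_required(1.7, 10)
print(f"{molecules:.2e} molecules from about {cells:.2e} cells")
```

On this reading, 1.7 fmol is roughly 10^9 molecules, which at 10 copies per cell would correspond to a lysate of the order of 10^8 cells.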

Immunoprecipitation studies 

Similarly, immunoprecipitation studies are another useful
way to exploit the resolving power of proteomics 24,25. In
this instance, very large quantities of protein (e.g. several
milligrams) can be subjected to incubation with antibodies
against chosen signalling molecules. This allows
high-affinity capture of these proteins, which can
subsequently be
eluted and electrophoresed on a 2D gel to provide a high- 
resolution proteome of a specific subset of proteins. 
Detection by blot analysis allows the identification of ex- 
tremely small amounts of defined signalling molecules. 
Again, the different isoforms of even very low abundance 
proteins can be seen, and, very importantly, the technique 
allows the investigator to identify multiprotein complexes 
or other proteins that co-precipitate with the target protein. 
These coassociating proteins frequently represent sig- 
nalling partners for the target protein, and their identifi- 
cation by mass spectrometry can lead to invaluable infor- 
mation on the signalling processes involved. 

The depth of signal transduction analysis offered by 
proteomics, and the utility for target validation studies, 
can be extended even further by applying cell fractionation
studies 26-28. By purifying subcellular fractions, such
as membrane, nuclear, organelle and cytosolic, it is possi- 
ble to assign a localization to proteins of interest and to 
follow their trafficking in a cell. Enrichment of these frac- 
tions will also allow much higher representation of low 
abundance proteins on the proteome. Their detection by 
fluorescent dyes or immunoblot techniques will lead to 
the identification of proteins in the range of 1-10 copies 
per cell, putting the sensitivity on a par with genomic 
approaches. 

These signal transduction analyses can be of additional 
value in experiments where inhibitors derived from a 
screening programme against the target are being evalu- 
ated for their potency and selectivity. The inhibitors can 
encompass small molecules, antisense nucleic acid con- 
structs, dominant-negative proteins, or neutralizing anti- 
bodies microinjected into cells. In each case, proteome 
analysis can provide unique data in support of validation 
studies for a chosen candidate drug target. 

Proteomics and drug mode-of-action studies 

Once a validated target is committed to a screening regi- 
men to identify and advance a lead molecule, it is impor- 
tant to confirm that the efficacy of the inhibitor is through 
the expected mechanism. Such mode-of-action studies are 
usually tackled by various cell biological and biochemical 
methods. Proteomics can also be usefully applied to these 
studies and this is illustrated below by describing data ob- 
tained with OGT719. This is a novel galactosyl derivative of 
the cytotoxic agent 5-fluorouracil (5-FU), which is currently 
being developed by OGS for the treatment of hepatocel- 
lular carcinoma and colorectal metastases localized 
in the liver. The premise underpinning the design and ra- 
tionale of OGT719 was to derive a 5-FU prodrug capable 

Figure 4. Features that are specifically up- or downregulated in Huh 7 cells by either 5-fluorouracil (5-FU) or
OGT719: (a) elongation factor 1α2, (b) novel (three peptides by MS-MS) and (c) α-subunit of prolyl-4-hydroxylase.
Arrows indicate up- or downregulation.



of targeting, and being retained in, cells bearing the asialo 
glycoprotein receptor (ASGP-r), including hepatocytes 29 , 
hepatoma Huh7 cells 30 and some colorectal tumour cells 31 . 
The growth of the human hepatoma cell line Huh7 is in- 
hibited by 5-FU or by OGT719. If the inhibition by 
OGT719 were the result of uptake and conversion to 5-FU 
as the active component, then it would be expected that 
Huh7 cells would show similar proteome profiles follow- 
ing exposure to either drug. 

To examine these possibilities, we conducted an experi- 
ment taking samples of Huh7 cells that had been treated 
with IC^ doses of either OGT719 or 5-FU. Total cell lysates 
were prepared and taken through 2D electrophoresis, 
fluorescence staining, digital imaging and Proteograph 
analysis. To facilitate the interpretation of the data across 
all of the 2291 features seen on the proteomes, drug- 
induced protein changes of fivefold or greater, identified 
by the Proteograph, were analysed further. Interestingly, 
from this analysis 19 identical proteins were changed five- 
fold or more by both drugs, strongly suggesting similarities 
in the mode of action for these two compounds. 
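
The selection of proteins changed fivefold or more by both drugs amounts to a set intersection over the two Proteograph outputs. A minimal sketch follows, with invented fold-change values and a hypothetical function name; requiring the change to be in the same direction for both drugs is an added assumption, since the text does not say whether direction was considered.

```python
# Hypothetical signed fold changes (treated vs untreated) keyed by MCI.
five_fu = {101: 6.2, 102: -5.5, 103: 7.0, 104: 2.1, 105: -8.3}
ogt719 = {101: 5.4, 102: -9.0, 103: -6.1, 104: 5.9, 106: 12.0}


def co_modulated(a, b, threshold=5.0):
    """MCIs changed by at least `threshold`-fold, in the same direction, by both drugs."""
    shared = []
    for mci in sorted(set(a) & set(b)):
        fa, fb = a[mci], b[mci]
        same_direction = (fa > 0) == (fb > 0)
        if abs(fa) >= threshold and abs(fb) >= threshold and same_direction:
            shared.append(mci)
    return shared


shared = co_modulated(five_fu, ogt719)
print(shared)  # → [101, 102]
```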

Thus, from very complex data involving >2000 protein 
features, using proteomics it is possible to analyse quanti- 
tatively and qualitatively each protein during its exposure 
to drugs. The biologist is now able to focus a series of fur- 
ther studies specifically on an enriched subset of proteins. 



Figure 4 shows highlighted examples of the selected areas 
of the proteome where some of these identified proteins in 
the above study are altered in response to either or both 
drugs. 

Several of the proteins identified above as being modu- 
lated similarly by 5-FU or OGT719 in Huh7 cells were sub- 
jected to tandem mass-spectrometric analysis for anno- 
tation. Some of these, such as the nuclear ribosomal 
RNA-binding protein 32 , can be placed into pyrimidine 
pathways or related cell cycle/growth biochemical path- 
ways in which 5-FU is known to act. 

To attribute further significance to the proteome mode- 
of-action studies with OGT719, another cell line, the rat 
sarcoma HSN, was used. Growth of these cells is inhibited 
by 5-FU, but they are completely refractory to OGT719; 
notably they lack the ASGP-r, which might explain this 
finding (unpublished). For our proteome studies, HSN 
cells were treated with 5-FU or OGT719 over a time course 
of one, two and four days. At each time point, cells were 
harvested and processed to derive proteomes and 
Proteographs. As before, we purposely focused on those 
proteins that increased or decreased by fivefold or more. 
In this instance, there were no proteins co-modulated by 
the two drugs. This is perhaps to be expected, given that 
the HSN cells are killed by 5-FU and yet are refractory to 
OGT719. 

Clear potential

The above is just an example of how proteomics can be 
used to address the mode of action of anticancer drugs. 
The potential of this approach is clear, and one can envis- 
age situations where it will be profitable to compare the 
proteomes of cells in which the drug target has been elimi- 
nated by molecular knockout techniques, or with small- 
molecule inhibitors believed to act specifically on the same 
target. In addition to using proteomics to examine the ac- 
tion of drugs, it is also possible to use this approach to 
gauge the extent of nonspecific effects that might eventu- 
ally lead to toxicity. For instance, in the example used 
above with HSN cells treated with OGT719, although cell 
growth was not affected, the levels of several specific pro- 
teins were changed. Further investigation of these proteins 
and the signalling pathways in which they are involved 
could be illuminating in predicting the likelihood or other- 
wise of long-term toxicity. 

Use of proteomics in formal drug
toxicology studies

A drug discovery programme at the stage where leads 
have been identified and mode-of-action studies are ad- 
vanced, will proceed to investigate the pharmacokinetic 
and toxicology profile of those agents. These two param- 
eters are of major importance in the drug discovery 
process, and many agents that have looked highly promis- 
ing from in vitro studies have subsequently failed because 
of insurmountable pharmacokinetic and/or toxicity prob- 
lems in vivo. Whereas the pharmacokinetic properties of a 
molecule can now be characterized quickly and accu- 
rately, toxicity studies are typically much longer and more 
demanding in their interpretation. 

The ability to achieve fast and accurate predictions of 
toxicity within an in vivo setting would represent a big 
step forward in accelerating any drug discovery pro- 
gramme. Toxicity from a drug can be manifested in any 
organ. However, because the liver and kidney are the 
major sites in the body responsible for metabolism and 
elimination of most drugs, it is informative to examine 
these particular organs in detail to provide early indi- 
cations about events that might result in toxicity. 

The basis for most xenobiotic metabolizing activity is to 
increase the hydrophilicity of the compound and so
facilitate its removal from the body. Most drugs are
metabolized in the liver via the cytochrome P450 family of
enzymes, which are known to comprise a total of ~200
different members 33,34, encompassing a wide array of
overlapping specificities for different substrates. In
addition to clearance, they also play a major role in
metabolism that can lead to the production and removal of toxic
species, and in some instances it is possible to correlate 
the ability or failure to remove such a toxin with a specific 
P450 or subgroup. 

Unique P450 profiles 

Each individual person will have a slightly different P450 
profile, largely from polymorphisms and changes in ex- 
pression levels, although other genetic and environmental 
factors aside from P450 also need to be taken into consid- 
eration. A significant amount of research is currently 
being directed towards this field - known as pharmacoge- 
nomics - with the aim of predicting how a patient will re- 
spond to a drug, as determined by their genetic make- 
up 35-37 . The marked variation of individuals in their ability 
to clear a compound can be one of the key factors in de- 
ciding the overall pharmacokinetic profile of a drug. Not 
only will this have a bearing on the likelihood of a patient 
responding to a treatment, but it will also be a factor in 
determining the possibility of their experiencing an ad- 
verse effect. 

Many pharmaceutical companies are already employing 
genomic approaches, involving P450 measurements, as a 
key step in their assessment of the toxicological profile of 
a candidate drug and therefore of its suitability, or other- 
wise, to be considered for human clinical trials. There are 
limits to this approach, however. Whereas the P450 mRNA 
profiling can predict with some accuracy the likely meta- 
bolic fate of a drug, it will not provide information on 
whether the metabolites would subsequently lead to tox- 
icity. Besides the patient-to-patient differences in steady- 
state levels of the P450s, there are also characteristic induc- 
tion responses of these enzymes to some drugs. Moreover, 
as there can be some doubt over the correlation of mRNA 
levels and the corresponding protein levels, there is scope 
for misinterpretation of the results and hence real advan- 
tages to be gained from a proteome approach. In both in- 
stances, the ability to examine entire proteome profiles, in- 
cluding the P450 proteins, will be a significant advantage 
in understanding and predicting the metabolism and 
toxicological outcome of drugs. 

In addition to direct organ and tissue studies, the serum, 
which collects the majority of toxicity markers released 
from susceptible organs and tissues throughout the entire 
body, can be utilized. Serum is rich in nuclease activity 
and, as pharmacogenomics is not suited to deal with these 
samples, valuable markers of toxicity could go undetected. 
However, by using proteomics for these types of analyses, 
serum markers (and clusters thereof) are now accessible 
for evaluation as indicators of toxicity. 



DDT Vol. 4, No. 2 February 1999 






Pharmacoproteomics 

Proteomics can thus be used to add a new sphere of 
analysis to the study of toxicity at the protein level, and in 
the era of '-omics' there is a case to be made to adopt the
term 'Pharmacoproteomics™'. Animals can be dosed with
increasing levels of an experimental drug over time, and 
serum samples can be drawn for consecutive proteome 
analyses. Using this procedure, it should be possible to 
identify individual markers, or clusters thereof, that are 
dose related and correlate with the emergence and severity 
of toxicity. Markers might appear in the serum at a defined
drug dose and time that are predictive of early toxicity
within certain organs, toxicity that will have damaging
consequences if allowed to continue. These serum markers
could subsequently be used to predict the response of each
individual and allow tailoring of therapy whereby optimal
efficacy is achieved without adverse side effects being
apparent. This application can obviously extend to track- 
ing toxicity of drugs in clinical trials where serum can be 
readily drawn and analysed. Surrogate markers for drug ef- 
ficacy could also be detected by this procedure and could 
facilitate the challenge of identifying patient classes who 
will respond favourably to a drug and at what dosage. 
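
One simple way to screen for dose-related serum markers is to correlate each marker's level with the administered dose. The sketch below uses a plain Pearson correlation; the dose series, the marker names and levels, and the 0.9 threshold are all invented for illustration, and a real study would need replicate animals and a proper statistical model.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical dose levels and serum marker intensities (arbitrary units).
doses = [0, 1, 3, 10, 30]
serum = {
    "marker_A": [10, 12, 19, 44, 97],   # rises with dose
    "marker_B": [50, 48, 53, 49, 51],   # essentially flat
}


def dose_related(serum, doses, min_r=0.9):
    """Names of markers whose level correlates strongly with dose."""
    return [name for name, levels in serum.items()
            if abs(pearson(doses, levels)) >= min_r]


candidates = dose_related(serum, doses)
print(candidates)
```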

Conclusions 

By contrast to the agents administered to patients in clini- 
cal wards, the process of drug discovery is not a prescrip- 
tive series of steps. The risks are high and there are long 
timelines to be endured before it is known whether a can- 
didate drug will succeed or fail. At each step of the drug 
discovery process there is often scope for flexibility in in- 
terpretation, which over many steps is cumulative. The 
pharmaceutical companies most likely to succeed in this 
environment are those that are able to make informed 
accurate decisions within an accelerated process. 

The genomics revolution has impacted very positively 
upon these issues and now has a powerful new partner in 
proteomics. The ability to undertake global analysis of pro- 
teins from a very wide diversity of biological systems and 
to interrogate these in a high-throughput, systematic man- 
ner will add a significant new dimension to drug discov- 
ery. Each step of the process from target discovery to clini- 
cal trials is accessible to proteomics, often providing 
unique sets of data. Using the combination of genomics
and proteomics, scientists can now see every dimension of
their biological focus, from genes and mRNA to proteins and
their subcellular localization. This will greatly assist our
understanding of the fundamental mechanistic basis of
human disease and allow new, improved and speedier
drug discovery strategies to be implemented.

ABSTRACT



A method for determining the ambient concentrations of a 
plurality of analytes in a liquid sample of volume V liters, 
comprises 

loading a plurality of different binding agents, each being 
capable of reversibly binding an analyte which is or 
may be present in the liquid sample and is specific for 
that analyte as compared to the other components of the 
liquid sample, onto a support means at a plurality of 
spaced apart locations such that each location has not 
more than 0.1 V/K, preferably less than 0.01 V/K, 
moles of a single binding agent, where K liters/mole is 
the equilibrium constant of the binding agent for the 
analyte; 

contacting the loaded support means with the liquid 
sample to be analyzed, such that each of the spaced 
apart locations is contacted in the same operation with 
the liquid sample, the amount of liquid used in the 
sample being such that only an insignificant proportion 
of any analyte present in the liquid sample becomes 
bound to the binding agent specific for it, and 

measuring a parameter representative of the fractional 
occupancy by the analytes of the binding agents at the 
spaced apart locations by a competitive or non- 
competitive assay technique using a site-recognition 
reagent for each binding agent capable of recognizing 
either the unfilled binding sites or the filled binding 
sites on the binding agent, said site-recognition reagent 
being labelled with a marker enabling the amount of 
said reagent in the particular location to be measured. 
A device and kit for use in the method are also 
provided. 



17 Claims, 1 Drawing Sheet 

DETERMINATION OF AMBIENT
CONCENTRATIONS OF SEVERAL 
ANALYTES 

This application is a continuation-in-part of U.S. patent 
application Ser. No. 07/984,264, filed Dec. 1, 1992, now 
U.S. Pat. No. 5,432,099, which is a continuation of U.S. 
patent application Ser. No. 07/460,878, filed Feb. 2, 1990, 
now abandoned, filed as PCT/GB88/00649, Aug. 5, 1988. 

FIELD OF THE INVENTION 

The present invention relates to the determination of 
ambient analyte concentrations in liquids, for example the 
determination of analytes such as hormones, proteins and 
other naturally occurring or artificially present substances in 
biological liquids such as body fluids. 

BACKGROUND OF THE INVENTION 

I have proposed in International Patent Application 
WO84/01031 to measure the concentration of an analyte in 
a fluid by contacting the fluid with a trace amount of a 
binding agent such as an antibody specific for the analyte in 
the sense that it reversibly binds the analyte but not other 
components of the fluid, determining a quantity representa- 
tive of the proportional occupancy of binding sites on the 
binding agent and estimating from that quantity the analyte 
concentration. In that application I point out that, provided 
that the amount of binding agent is sufficiently low that its 
introduction into the fluid causes no significant diminution 
of the concentration of ambient (unbound) analyte, the 
fractional occupancy of the binding sites on the binding 
agent by the analyte is effectively independent of the abso- 
lute volume of the fluid and of the absolute amount of 
binding agent, i.e. independent within the limits of error 
usually associated with the measurement of fractional
occupancy. In such circumstances, and in these circumstances
only, the initial concentration [H] of analyte in the fluid is
related to the fraction (Ab/Ab_0) of binding sites on the
binding agent occupied by the analyte by the equation:

$$\frac{Ab}{Ab_0} = \frac{K_{ab}[H]}{1 + K_{ab}[H]}$$

where K_ab (hereinafter referred to as K) is the equilibrium
constant for the binding of the analyte to the binding sites 
and is a constant for a given analyte and binding agent at any 
one temperature. This constant is generally known as the 
affinity constant, especially when the binding agent is an 
antibody, for example a monoclonal antibody. 
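
By way of illustration only, the relationship between fractional occupancy and ambient analyte concentration can be evaluated numerically. The sketch assumes the standard mass-action form, occupancy = K[H]/(1 + K[H]), and its inverse [H] = occupancy/(K(1 - occupancy)); the function names and the example values of K and [H] are hypothetical.

```python
def fractional_occupancy(K, H):
    """Fraction of binding sites occupied at ambient analyte concentration H (mole/liter)."""
    return K * H / (1 + K * H)


def analyte_concentration(K, occupancy):
    """Invert the occupancy relation to recover the analyte concentration."""
    return occupancy / (K * (1 - occupancy))


K = 1e11    # liters/mole, of the order quoted for a monoclonal antibody
H = 2e-11   # mole/liter, an arbitrary example concentration

f = fractional_occupancy(K, H)
H_back = analyte_concentration(K, f)
print(f"occupancy = {f:.4f}, recovered [H] = {H_back:.3e} mole/liter")
```

Neither the sample volume nor the amount of binding agent appears in either function, which is the point of the trace-binder condition.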

The concept of using only a trace amount of binding agent 
is contrary to generally recommended practice in the field of 
immunoassay and immunometric techniques. For example, 
in such a well-known work as "Methods in Investigative and 
Diagnostic Endocrinology", ed. S. A. Berson and R. S. 
Yalow, 1973 at pages 111-116, it is proposed that in the 
performance of a competitive immunoassay maximum sen- 
sitivity of the assay is achieved if the proportion of the 
"tracer" analyte that is bound approximates to 50%. In order 
to achieve such a high degree of binding of the analyte the 
theory of Berson and Yalow, to this day generally accepted 
by other workers in the field, requires that the concentration 
of binding agent (or, strictly speaking, of binding sites, each 
molecule of binding agent conventionally having one or at 
most two binding sites) must be greater than or equal to the 
reciprocal of the equilibrium constant (K) of the binding 
agent for the analyte, i.e. [Ab] ≥ 1/K. For a sample of volume
V the total amount of binding agent (or binding sites) must
therefore be greater than or equal to V/K. A binding agent
which is a monoclonal antibody may, for example, have an
equilibrium constant (K) which is of the order of 10^11
liters/mole for the specific antigen to which it binds. Thus,
under the above generally accepted practice, a binding agent
(or site) concentration of the order of 10^-11 mole/liter or
more is required for binding agents of such an equilibrium
constant and, with fluid sample volumes of the order of 1
milliliter, the use of 10^-14 or more mole of binding agent (or
site) is conventionally deemed necessary. Avogadro's number
is about 6×10^23 so that 10^-14 mole of binding site is
equivalent to more than 10^9 molecules of binding agent even
assuming that the binding agent possesses two binding sites
per molecule. For specific binding agents of the very highest
affinity K is less than 10^13 liters/mole so that conventional
practice requires more than 10^7 molecules of binding agent,
whereas binding agents with lower affinity of the order of
10^8 liters/mole necessitate the use of more than 10^12
molecules under conventional practice. In fact all immunoassay
kits marketed commercially at the present time conform to
these concepts and use an amount of binding site
approximating to or, more frequently, considerably in excess of
V/K; indeed in certain types of kit relying on the use of
labelled antibodies it is conventional to use as much binding
agent as possible, binding proportions of analyte greatly
exceeding 50%.
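
The orders of magnitude quoted above may be checked with a short calculation. The following sketch assumes a 1 milliliter sample and two binding sites per molecule, as in the text; it uses 6.022×10^23 for Avogadro's number where the text rounds to 6×10^23, and the function name is illustrative only.

```python
AVOGADRO = 6.022e23  # molecules per mole


def conventional_minimum(K, volume_l=1e-3, sites_per_molecule=2):
    """Return (moles of binding sites, molecules of binding agent) equal to V/K."""
    moles_sites = volume_l / K
    molecules = moles_sites * AVOGADRO / sites_per_molecule
    return moles_sites, molecules


for K in (1e11, 1e13, 1e8):
    moles_sites, molecules = conventional_minimum(K)
    print(f"K = {K:.0e} liters/mole: V/K = {moles_sites:.0e} mole, "
          f"about {molecules:.1e} molecules")
```

For K of 10^11, 10^13 and 10^8 liters/mole this gives about 3×10^9, 3×10^7 and 3×10^12 molecules respectively, in line with the "more than 10^9", "more than 10^7" and "more than 10^12" stated above.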

Because of the binding of substantial proportions, for
example 50%, of the analyte in the liquid samples under test
in such systems, the fractional occupancy of the binding
sites of the binding agent is not independent of the volume
of the fluid sample so that for accurate quantitative assays it
is necessary to control accurately the volume of the sample,
keeping it constant in all tests, whether of the sample of
unknown concentration or of the standard samples of known
concentration used to generate the dose response curve.
Furthermore, such systems also require careful control of the
amount of binding agent present in the standard and control
incubation tubes. These limitations of present techniques are
universally recognised and accepted.
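
The volume dependence described above can be illustrated by solving the binding equilibrium exactly for a fixed amount of binding sites and a varying sample volume. This is a sketch under stated assumptions: simple one-to-one reversible binding, K = 10^11 liters/mole, an analyte concentration of 10^-11 mole/liter, and two arbitrary binder amounts (0.01 V/K and 10 V/K, referred to a 1 milliliter sample). All names and values are chosen for the example.

```python
from math import sqrt


def occupancy(K, H0, sites_mol, volume_l):
    """Exact fractional occupancy for total analyte concentration H0 (mole/liter)
    and a fixed amount of binding sites (mole) in a sample of the given volume."""
    B = sites_mol / volume_l                  # binding-site concentration
    s = H0 + B + 1.0 / K
    # Smaller root of x^2 - s*x + H0*B = 0, in a numerically stable form.
    bound = 2.0 * H0 * B / (s + sqrt(s * s - 4.0 * H0 * B))
    return bound / B


def relative_spread(values):
    return (max(values) - min(values)) / min(values)


K, H0 = 1e11, 1e-11
volumes = (0.5e-3, 1e-3, 2e-3)                # liters
v_over_k = 1e-3 / K                           # V/K for a 1 ml sample, in mole

trace = [occupancy(K, H0, 0.01 * v_over_k, v) for v in volumes]
conventional = [occupancy(K, H0, 10 * v_over_k, v) for v in volumes]

print("trace binder:       ", [round(x, 4) for x in trace])
print("conventional binder:", [round(x, 4) for x in conventional])
```

With the trace amount the occupancy stays within about half a percent of its value across a fourfold change in volume, whereas with the larger amount it changes several-fold.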

UK Patent Application 2,099,578A discloses a device for
immunoassays comprising a porous solid support to which
antigens, or less frequently immunoglobulins, are bound at
a plurality of spaced apart locations, said device permitting
a large number of qualitative or quantitative immunoassays
to be performed on the same support, for example to
establish an antibody profile of a sample of human blood
serum. However, although the individual locations may be in
the form of so-called microdots produced by supplying
droplets of antigen-containing solutions or suspensions, the
number of moles of antigen present at each location is
apparently still envisaged as being enough to bind
essentially all of the analyte (e.g. antibody) whose
concentration is to be measured that is present in the liquid
sample under test. This is apparent from the fact that the
quantitative method used in that application (page 3, lines
21-28) involves calibration with known amounts of
immunoglobulin being applied to the support; but this means
that, in the samples being tested, essentially every molecule
must be extracted from the sample in order for a true
comparison to be made and hence that large amounts of antigen
(i.e. the binding agent in this situation) are required in each
microdot, greatly in excess of the total amount of analyte
(i.e. antibody in this situation) present in the sample.

SUMMARY OF THE INVENTION

The present invention involves the realisation that the use 
of high quantities of binding agent is neither necessary for 
good sensitivity in immunoassays nor is it generally desirable. 
If, instead of being kept as large as possible, the amount of 
binding agent is reduced so that only an insignificant proportion 
of the analyte is reversibly bound to it, generally less than 10%, 
usually less than 5% and for optimum results only 1 or 2% or less, 
not only is it no longer necessary to use an accurately controlled, 
constant volume for all the liquid samples (standard solutions and 
unknown samples) in a given assay, but it is also possible to obtain 
reliable and sometimes even improved estimates of analyte 
concentration using much less than V/K moles of binding agent 
binding sites, say not more than 0.1 V/K and preferably less than 
0.01 V/K. For a binding agent having an equilibrium constant (K) 
for the analyte of the order of 10^11 liters/mole and samples of 
approximately 1 ml size this is approximately equivalent to not 
more than 10^8, preferably less than 10^7, molecules of binding 
agent at each location in an individual array. If the value of K is 
10^13 liters/mole the figures are 10^6 and 10^5 molecules 
respectively, and if K is of the order of 10^8 liters/mole they are 
10^11 and 10^10 molecules respectively. Below 10 molecules of 
binding agent at a single location the accuracy of the measurement 
would become progressively less as the fractional occupancy of the 
binding agent sites by the analyte would be able to change only in 
discrete steps as individual sites become occupied or unoccupied, 
but in principle at least the use of as low as 10 molecules would 
be permissible if an estimate with an accuracy of 10% is acceptable. 
Practical considerations may give rise to a preference for more 
than 10^4 molecules. 

It will be appreciated that the above mentioned GB patent 
application 2,099,578A, which for quantitative estimation relies on 
large amounts of binding agent and essentially total sequestration 
of all analyte, fails to recognise the advance achieved by the 
present invention, which instead relies on a different analytical 
principle requiring measurement of the fractional occupancy of the 
binding agent and which thus requires only a very low proportion of 
the total analyte molecules present to be sequestered from the 
sample. 

Following the recognition that the use of such small amounts of 
binding agent is permissible, it becomes feasible to place the 
binding agent required for a single concentration measurement on a 
very small area of a solid support and hence to place in 
juxtaposition to one another but at spatially separate points on a 
single solid support a wide variety of different binding agents 
specific for different analytes which are or may be present 
simultaneously in a liquid to be analysed. Simultaneous exposure of 
each of the separate points to the liquid to be analysed will cause 
each binding agent spot to take up the analyte for which it is 
specific to an extent (i.e. fractional binding site occupancy) 
representative of the analyte concentration in the liquid, provided 
only that the volume of solution and the analyte concentration 
therein are large enough that only an insignificant fraction 
(generally less than 10%, usually less than 5%) of the analyte is 
bound to the point. The fractional binding site occupancy for each 
binding agent can then be determined using separate site-recognition 
reagents which recognise either the unfilled binding sites or filled 
binding sites of the different binding agents and which are labelled 
with markers enabling the concentration levels of the separate 
reagents bound to the different binding agents to be measured, for 
example fluorescent markers. Such measurements may be performed 
consecutively, for example using a laser which scans across the 
support, or simultaneously, for example using a photographic plate, 
depending on the nature of the labels. Other imaging devices such as 
a television camera can also be used where appropriate. Because the 
binding agents are spatially separate from one another it is 
possible to use only a small number of different marker labels or 
even the same marker label throughout and to scan each binding agent 
location separately to determine the presence and concentration of 
the label. By use of the invention considerably more than 3 analyses 
can be performed with a single exposure of the solid support with 
liquid to be analysed, for example 10, 20, 30, 50 or even up to 100 
or several hundreds of analyses. 

Overall, therefore, the present invention provides a method for 
determining the ambient concentrations of a plurality of analytes in 
a liquid sample of volume V liters, comprising: 

loading a plurality of different binding agents, each being capable 
of reversibly binding an analyte which is or may be present in the 
liquid sample and is specific for that analyte as compared to the 
other components of the liquid sample, onto a support means at a 
plurality of spaced apart locations such that each location has not 
more than 0.1 V/K moles of a single binding agent, where K 
liters/mole is the equilibrium constant of the binding agent for the 
analyte, 

contacting the loaded support means with the liquid sample to be 
analysed such that each of the spaced apart locations is contacted 
in the same operation with the liquid sample, the amount of liquid 
used in the sample being such that only an insignificant proportion 
of any analyte present in the liquid sample becomes bound to the 
binding agent specific for it, and 

measuring a parameter representative of the fractional occupancy by 
the analytes of the binding agents at the spaced apart locations by 
a competitive or non-competitive assay technique using a 
site-recognition reagent for each binding agent capable of 
recognising either the unfilled binding sites or the filled binding 
sites on the binding agent, said site-recognition reagent being 
labelled with a marker enabling the amount of said reagent in the 
particular location to be measured. 

The invention also provides a device for use in determining the 
ambient concentrations of a plurality of analytes in a liquid sample 
of volume V liters, comprising a solid support means having located 
thereon at a plurality of spaced apart locations a plurality of 
different binding agents, each binding agent being capable of 
reversibly binding an analyte which is or may be present in the 
liquid sample and is specific for that analyte as compared to the 
other components of the liquid sample, each location having not more 
than 0.1 V/K, preferably less than 0.01 V/K, moles of a single 
binding agent, where K liters/mole is the equilibrium constant of 
that binding agent for reaction with the analyte to which it is 
specific. 

A kit for use in the method according to the invention comprises a 
device according to the invention, a plurality of standard samples 
containing known concentrations of the analytes whose concentrations 
in the liquid sample are to be measured and a set of labelled 
site-recognition reagents for reaction with filled or unfilled 
binding sites on the binding agents. 

In arriving at the method of the invention, I have found that, 
generally speaking, for antibodies having an affinity constant K 
liters/mole for an antigen, the relationship between the antibody 
concentration and the fractional occupancy of the binding sites at 
any particular antigen concentration and the relationship between 
the antibody concentration and the percentage of antigen bound to 
the binding sites at any particular antigen concentration follow the 
same 
curves provided that the antibody concentrations and the antigen 
concentrations are each expressed in terms of fractions or multiples 
of 1/K. 

BRIEF DESCRIPTION OF THE DRAWING 

The principle underlying the method of the invention may be better 
understood by reference to the accompanying drawing which is a graph 
representing two sets of curves plotting the relationship between 
antibody concentration and the fractional occupancy of the binding 
sites at certain prescribed antigen concentrations and the 
relationship between antibody concentration and the percentage of 
antigen bound to the binding sites at the same prescribed antigen 
concentrations. Each curve relates to the antibody concentration 
[Ab], expressed in terms of 1/K, plotted along the x-axis. For the 
set of curves which remain constant or decline with increasing [Ab], 
the y-axis represents the fractional occupancy (F) of binding sites 
on the antibody by the antigen; for the second set, the y-axis 
represents the percentage (b%) of antigen bound to those binding 
sites. The individual curves in each set represent the relationships 
corresponding to four different antigen concentrations [An] 
expressed in terms of K, namely 10/K, 1.0/K, 0.1/K and 0.01/K. The 
curves show that as [Ab] falls F reaches an essentially constant 
level, the value of which is dependent on [An]. 

DETAILED DESCRIPTION 

The choice of a solid support is a matter to be left to the user. 
Preferably the support is non-porous so that the binding agent is 
disposed on its surface, for example as a monolayer. Use of a porous 
support may cause the binding agent, depending on its molecular 
size, to be carried down into the pores of the support where its 
exposure to the analyte whose concentration is to be determined may 
likewise be affected by the geometry of the pores, so that a false 
reading may be obtained. Porous supports such as nitrocellulose 
paper dotted with spots of binding agent are therefore less 
preferred. Unlike the supports used in GB 2,099,578A, which seem to 
need to be porous because of the large number of molecules to be 
attached, the supports for use in the present invention use much 
smaller quantities and therefore need not be porous. The non-porous 
supports may, for example, be of plastics material or glass, and any 
convenient rigid plastics material may be used. Polystyrene is a 
preferred plastics material, although other polyolefins or acrylic 
or vinyl polymers could likewise be used. 

The support means may comprise microbeads, e.g. of such a plastics 
material, which can be coated with uniform layers of binding agent 
and retained in specified locations, e.g. hollows, on a support 
plate. Alternatively the material may be in the form of a sheet or 
plate which is spotted with an array of dots of binding agent. It 
can be advantageous for the configuration of the support means to be 
such that liquid samples of approximately the volume V liters are 
readily retained in contact with the plurality of spaced apart 
locations marked with the different binding agents. For example, the 
spaced apart locations may be arranged in a well in the support 
means, and a plurality of wells, each provided with the same group 
of different binding agents in spaced apart locations, can be linked 
together to form a microtitre plate for use with a plurality of 
samples. 

When the support means is to be used in conjunction with a measuring 
system involving light scanning, the material, e.g. plastics, for 
the support is desirably opaque to light, for example it may be 
filled with an opacifying material which may inter alia be white or 
black, such as carbon black, when the signals to be measured from 
the binding agent or the site-recognition reagent are light signals, 
as from fluorescent or luminescent markers. In general, reflective 
materials are preferred in this case to enhance light collection in 
the detecting instrument or photographic plate. The final choice of 
optimum material is governed by its ability to attach the binding 
agent to its surface, its absence of background signal emission and 
its possession of other properties tending to maximise the 
signal/noise ratio for the particular marker or markers attached to 
the binding agent situated on its surface. Very satisfactory results 
have been obtained in the Examples described below by the use of a 
white opaque polystyrene microtitre plate commercially available 
from Dynatech under the trade name White Microfluor microtitre 
wells. 

The binding agents used may be binding agents of different 
specificity, that is to say agents which are specific to different 
analytes, or two or more of them may be binding agents of the same 
specificity but of different affinity, that is to say agents which 
are specific to the same analyte but have different equilibrium 
constants K for reaction with it. The latter alternative is 
particularly useful where the concentration of analyte to be assayed 
in the unknown sample can vary over considerable ranges, for example 
2 or 3 orders of magnitude, as in the case of HCG measurement in 
urine of pregnant women, where it can vary from 0.1 to 100 or more 
IU/ml. 

The binding agents used will preferably be antibodies, more 
preferably monoclonal antibodies. Monoclonal antibodies to a wide 
variety of ingredients of biological fluids are commercially 
available or may be made by known techniques. The antibodies used 
may display conventional affinity constants, for example from 10^8 
or 10^9 liters/mole upwards, e.g. of the order of 10^10 or 10^11 
liters/mole, but high affinity antibodies with affinity constants of 
10^12-10^13 liters/mole can also be used. The invention can be used 
with such binding agents which are not themselves labelled. However, 
it is also possible and frequently desirable to use labelled binding 
agents so that the system binding agent/analyte/site-recognition 
reagent includes two different labels of the same type, e.g. 
fluorescent, chemiluminescent, enzyme or radioisotopic, one on the 
binding agent and one on the site-recognition reagent. The measuring 
operation then measures the ratio of the intensity of the two 
signals and thus eliminates the need to place the same amount of 
labelled binding agent on the support when measuring signals from 
standard samples for calibration purposes as when measuring signals 
from the unknown samples. Because the system depends solely on 
measurement of a ratio representative of binding site occupancy, 
there is also no need to measure the signal from the entire spot but 
scanning only a portion is sufficient. Each binding agent is 
preferably labelled with the same label but different labels can be 
used. 

The binding agents may be applied to the support in any of the ways 
known or conventionally used for coating binding agents onto 
supports such as tubes, for example by contacting each spaced apart 
location on the support with a solution of the binding agent in the 
form of a small drop, e.g. 0.5 microliter, on a 1 mm^2 spot, and 
allowing them to remain in contact for a period of time before 
washing the drops away. A roughly constant small fraction of the 
binding agent present in the drop becomes adsorbed onto the support 
as a result of this procedure. It is to be noted that the coating 
density of binding agent on the microspot does not need to be less 
than the coating density in conventional antibody-coated tubes; the 
reduction in the number of molecules on 
each spot may be achieved solely by reduction of the size of the 
spot rather than the coating density. A high coating density is 
generally desirable to maximise signal/noise ratios. The sizes of 
the spots are advantageously less than 10 mm^2, preferably less than 
1 mm^2. The separation is desirably, but not necessarily, 2 or 3 
times the radius of the spot, or more. These suggested geometries 
can nevertheless be changed as required, being subject solely to the 
limitations on the number of binding agent molecules in each spot, 
the minimum volume of the sample to which the array of spots will be 
exposed and the means locally available for conveniently preparing 
an array of spots in the manner described. 

Once the binding agents have been coated onto the support it is 
conventional practice to wash the support, in the case of antibodies 
as binding agents, with a solution containing albumen or other 
protein to saturate all remaining non-specific adsorption sites on 
the support and elsewhere. To confirm that the amount of binding 
agent in an individual spot will be less than the maximum amount 
(0.1 V/K) required to conform to the principle of the present 
invention, the amount of binding agent present on any individual 
site can be checked by labelling the binding agent with a detectable 
marker of known specific activity (i.e. known amount of marker per 
unit weight of binding agent) and measuring the amount of marker 
present. Thus, if the use of labelled binder is not desired on the 
solid support used in the method of the invention the binding agent 
can nevertheless be labelled in a trial experiment and identical 
conditions to those found in that trial to give rise to correct 
loadings of binding agent can be used to apply unlabelled binding 
agent to the supports to be actually used. 

The minimum size of the liquid sample (V liters) is correlated with 
the number of moles of binding agent (less than 0.1 V/K) so that 
only an insignificant proportion of the analyte present in the 
liquid sample becomes bound to the binding agent. This proportion is 
as a general rule less than 10%, usually less than 5% and desirably 
1 or 2% or less, depending on the accuracy desired for the assay 
(greater accuracy being obtained, other things being equal, when 
smaller proportions of analyte are bound) and the magnitude of other 
error-introducing factors present. Sample sizes of the order of one 
or a few ml or less, e.g. down to 100 microliters or less, are often 
preferred, but circumstances may arise when larger volumes are more 
conveniently assayed, and the geometry may be adjusted accordingly. 
The sample may be used at its natural concentration level or if 
desired it may be diluted to a known extent. 

The site-recognition reagents used in the method according to the 
invention may themselves be antibodies, e.g. monoclonal antibodies, 
and may be anti-idiotypic or anti-analyte antibodies, the latter 
recognising occupied sites. Alternatively, for example for analytes 
of small molecular size such as thyroxine (T4), unoccupied sites may 
be recognised using either the analyte itself, appropriately 
labelled, or the analyte covalently coupled to another molecule 
(e.g. a protein molecule) which is directly or indirectly labelled. 
The site-recognition reagents may be labelled directly or indirectly 
with conventional fluorescent labels such as fluorescein, rhodamine 
or Texas Red or materials usable in time-resolved pulsed 
fluorescence such as europium and other lanthanide chelates, in a 
conventional manner. Other labels such as chemiluminescent, enzyme 
or radioisotopic labels may be used if appropriate. Each 
site-recognition reagent is preferably labelled with the same label 
but different labels can be used in different reagents. The 
site-recognition reagents may be specific for a single one of the 
binding agent/analyte spots in each group of spots or in certain 
circumstances, as with glycoprotein hormones such as HCG and FSH 
which have a common binding site, they may be cross-reacting 
reagents able to react with occupied binding sites in more than one 
of the spots. 

In the assay technique the signals representative of the fractional 
occupancy of the binding agent in the test samples of unknown 
concentrations of the analytes can be calibrated by reference to 
dose response curves obtained from standard samples containing known 
concentrations of the same analytes. Such standard samples need not 
contain all the analytes together, provided that each of the 
analytes is present in some of the standard samples. Fractional 
occupancy may be measured by estimating occupied binding sites (as 
with an anti-analyte antibody) or unoccupied binding sites (as with 
an anti-idiotypic antibody), as one is the converse of the other. 
For greater accuracy it is desirable to measure the fraction which 
is closer to zero, as a change in fractional occupancy of 0.01 is 
proportionately greater in this case, although for fractional 
occupancies in the range 25-75% either alternative is generally 
satisfactory. 

In that embodiment of the present invention which relies on two 
fluorescent markers, the measurement of relative intensity of the 
signals from the two markers, one on the binding agent and the other 
on the site-recognition reagent, may be carried out by a laser 
scanning confocal microscope such as a Bio-Rad Lasersharp MRC 500, 
available from Bio-Rad Laboratories Ltd., and having a dual channel 
detection system. This instrument relies on a laser beam to scan the 
dots or the like on the support to cause fluorescence of the markers 
and wavelength filters to distinguish and measure the amounts of 
fluorescence emitted. Time-resolved fluorescence methods may also be 
used. Interference (so-called crosstalk) between the two channels 
can be compensated for by standard corrections if it occurs or 
conventional efforts can be made to reduce it. Discrimination of the 
two fluorescent signals emitted by the dual-labelled spots is 
accomplished, in the present form of this instrument, by filters 
capable of distinguishing the characteristic wavelength of the two 
fluorescent emissions; however, fluorescent substances may be 
distinguished by other physical characteristics, such as differing 
fluorescence decay times, bleaching times, etc., and any of these 
means may be used, either alone or in combination, to differentiate 
between two fluorophores and hence permit measurement of the ratio 
of two fluorescent labelled entities (binding agent and 
site-recognition reagent) present on an individual spot, using 
techniques well known in the fluorescence measurement field. When 
only one fluorescent label is present the same techniques may be 
used, provided that care is taken to scan the entire spot in each 
case and the spots contain essentially the same amount of binding 
agent from one assay to the next when the unknown and standard 
samples are used. 

In the case of other labels, such as radioisotopic labels, 
chemiluminescent labels or enzyme labels, analogous means of 
distinguishing the individual signals from one or from each of a 
pair of such labels are also well known. For example two 
radioisotopes such as 125I and 131I may be readily distinguished on 
the basis of the differing energies of their respective radioactive 
emissions. Likewise it is possible to identify the products of two 
enzyme reactions, deriving from dual enzyme-labelled antibody 
couplets, these being e.g. of different colours, or two 
chemiluminescent reactions, e.g. of different chemiluminescent 
lifetime or wavelength of light emission, by techniques well known 
in the respective fields. 

The invention may be used for the assaying of analytes present in 
biological fluids, for example human body fluids 
such as blood, serum, saliva or urine. They may be used for the 
assaying of a wide variety of hormones, proteins, enzymes or other 
analytes which are either present naturally in the liquid sample or 
may be present artificially such as drugs, poisons or the like. 

For example, the invention may be used to provide a device for 
quantitatively assaying a variety of hormones relating to pregnancy 
and reproduction, such as FSH, LH, HCG, prolactin and steroid 
hormones (e.g. progesterone, estradiol, testosterone and 
androstenedione), or hormones of the adrenal pituitary axis, such as 
cortisol, ACTH and aldosterone, or thyroid-related hormones, such as 
T4, T3, and TSH and their binding protein TBG, or viruses such as 
hepatitis, AIDS or herpes virus, or bacteria, such as staphylococci, 
streptococci, pneumococci, gonococci and enterococci, or 
tumour-related peptides such as AFP or CEA, or drugs such as those 
banned as illicit improvers of athletes' performance, or food 
contaminants. In each case the binding agents used will be specific 
for the analytes to be assayed (as compared with others in the 
sample) and may be monoclonal antibodies therefor. 

Further details on the methodology are to be found in my 
International Patent Publication WO88/01058, the contents of which 
are incorporated herein by reference. 

The invention is illustrated by the following Examples. 

EXAMPLE 1 

An anti-TNF (tumour necrosis factor) antibody having an affinity 
constant for TNF at 25° C. of about 1×10^9 liters/mole is labelled 
with Texas Red. A solution of the antibody at a concentration of 80 
micrograms/ml is formed and 0.5 microliter aliquots of this solution 
are added in the form of droplets one to each well of a Dynatech 
Microfluor (opaque white) filled polystyrene microtitre plate having 
12 wells. 

An anti-HCG (human chorionic gonadotropin) antibody having an 
affinity constant for HCG at 25° C. of about 6×10^8 liters/mole is 
also labelled with Texas Red. A solution of the antibody at a 
concentration of 80 micrograms/ml is formed and 0.5 microliter 
aliquots of this solution are added in the form of droplets one to 
each well of the same Dynatech Microfluor microtitre plate. 

After addition of the droplets the plate is left for a few hours in 
a humid atmosphere to prevent evaporation of the droplets. During 
this time some of the antibody molecules in the droplets become 
adsorbed onto the plate. Next, the wells are washed several times 
with a phosphate buffer and then they are filled with about 400 
microliters of a 1% albumen solution and left for several hours to 
saturate the residual binding sites in the wells. Thereafter they 
are washed again with phosphate buffer. 

The resulting plate has in each of its wells two spots each of area 
approximately 1 mm^2. Measurement of the amount of fluorescence 
shows that in each well one spot contains about 5×10^9 molecules of 
anti-TNF antibody and the other contains about 5×10^9 molecules of 
anti-HCG antibody. The wells are designed for use with liquid 
samples of volume 400 microliters, so that 0.1 V/K is 4×10^-14 moles 
(equivalent to 2.4×10^10 molecules) for the anti-TNF antibody and 
7×10^-14 moles (equivalent to 4×10^10 molecules) for the anti-HCG 
antibody. 

EXAMPLE 2 

A microtitre plate prepared as described in Example 1 is used in an 
assay for an artificially produced solution containing TNF and HCG. 
A test sample of the solution, amounting to about 400 microliters, 
is added to one of the wells and allowed to incubate for several 
hours. About 400 microliters of various standard solutions 
containing known concentrations (0.02, 0.2, 2 and 20 ng/ml) of TNF 
or HCG are added to other wells of the plate and also allowed to 
incubate for several hours. The wells are then washed several times 
with buffer solution. 

As site-recognition reagents there are used for the TNF spots an 
anti-TNF antibody having an affinity constant for TNF at 25° C. of 
about 1×10^10 liters/mole and for the HCG spots an anti-HCG antibody 
having an affinity constant for HCG at 25° C. of about 1×10^11 
liters/mole. Both antibodies are labelled with fluorescein (FITC). 
400 microliter aliquots of solutions of these labelled antibodies 
are added to the wells and allowed to stand for a few hours. The 
wells are then washed with buffer. 

The resulting fluorescence ratio of each spot is quantified with a 
Bio-Rad Lasersharp MRC 500 confocal microscope. From the standard 
solutions dose response curves for TNF and HCG are built up, the 
figures for TNF being as follows: 

TNF concentration    FITC fluorescence / Texas Red fluorescence 
(ng/ml)              on TNF spot 
0.02                 [illegible] 
0.2                  [illegible] 
2                    [illegible] 
20                   42[illegible] 

and those for HCG being as follows: 

HCG concentration    FITC fluorescence / Texas Red fluorescence 
(ng/ml)              on HCG spot 
0.02                 1.8 
0.2                  [illegible] 
2                    [illegible] 
20                   28.2 

The artificially produced solution was found to give ratio readings 
of 5.9 on the TNF spot and 10.5 on the HCG spot, correlating well 
with the actual concentrations of TNF (0.5 ng/ml) and HCG (0.5 
ng/ml) from the dose response curves. 

EXAMPLE 3 

Using similar procedures to those outlined in Example 1 a microtitre 
plate containing spots of labelled anti-T4 (thyroxine) antibody 
(affinity constant about 1×10^11 liters/mole at 25° C.), labelled 
anti-TSH (thyroid stimulating hormone) antibody (affinity constant 
about 5×10^11 liters/mole at 25° C.) and labelled anti-T3 
(triiodothyronine) antibody (affinity constant about 1×10^11 
liters/mole at 25° C.) in each of the individual wells is produced, 
the spots containing less than 1×10^-12 V moles of anti-T4 antibody 
or less than 2×10^-13 V moles of anti-TSH antibody or less than 
1×10^-12 V moles of anti-T3 antibody. 

The developing antibody (site-recognition reagent) for the TSH assay 
is an anti-TSH antibody with an affinity constant for TSH of 2×10^10 
liters/mole at 25° C. This antibody is labelled with fluorescein 
(FITC). The site-recognition reagents for the T4 and T3 assays are 
T4 and T3 coupled to poly-lysine and labelled with FITC, and they 
recognise the unfilled sites on their respective first antibodies. 

Using 400 microliter aliquots of standard solutions containing 
various known amounts of T4, T3 and TSH, dose response curves are 
obtained by methods analogous to those in Example 2, correlating 
fluorescence ratios with T4, T3 and TSH concentrations. The plate is 
used to measure T4, T3 and TSH levels in serum from human patients 
with good correlation with the results obtained by other methods. 
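
The loading limits quoted in Examples 1 and 3 follow directly from the rule that each spot should carry no more than 0.1 V/K moles of binding agent. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and are not part of the specification, and Avogadro's number is taken as 6.022×10^23 per mole.

```python
AVOGADRO = 6.022e23  # molecules per mole


def max_loading_moles(volume_l, k_l_per_mol, factor=0.1):
    """Upper limit on binding agent per spot, factor * V / K, in moles.

    volume_l     -- sample volume V in liters
    k_l_per_mol  -- equilibrium (affinity) constant K in liters/mole
    factor       -- 0.1 for the general limit, 0.01 for the preferred limit
    """
    return factor * volume_l / k_l_per_mol


def moles_to_molecules(moles):
    """Convert an amount in moles to a number of molecules."""
    return moles * AVOGADRO


# Example 1: wells designed for 400 microliter samples.
well_volume_l = 400e-6
for name, k in [("anti-TNF", 1e9), ("anti-HCG", 6e8)]:
    limit = max_loading_moles(well_volume_l, k)
    print(f"{name}: {limit:.1e} mol = {moles_to_molecules(limit):.1e} molecules")
```

With V set to 1 liter the same function gives the per-liter coefficients used in Example 3, e.g. 0.1/K = 2×10^-13 moles per liter of sample for an antibody with K = 5×10^11 liters/mole.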

EXAMPLE 4 

Using similar procedures to those outlined in Example 1 
a microtitre plate containing spots of first labelled anti-HCG 
antibody (affinity constant about 6×10^8 liters/mole at 25° 
C.), second labelled anti-HCG antibody (affinity constant 
about 1.3×10^11 liters/mole at 25° C.) and labelled anti-FSH 
(follicle stimulating hormone) antibody (affinity constant 
about 1.3×10^8 liters/mole at 25° C.) in each of the individual 
wells is produced, the spots each containing less than 0.1 
V/K moles of the respective antibody. A cross-reacting 
(alpha subunit) monoclonal antibody 8D10 with an affinity 
constant of 1×10^11 liters/mole is used as a common 
developing antibody for both the HCG and the FSH assays. 

Using 400 microliter aliquots of standard solutions con- 
taining various known concentrations of HCG and FSH, 
dose response curves are obtained by methods analogous to 
those in Example 2, correlating fluorescence ratios with 
HCG and FSH concentrations, the curve obtained with the 
higher affinity anti-HCG antibody giving more 
concentration-sensitive results at the lower HCG concentra- 
tions whereas the curve from the lower affinity anti-HCG 
antibody is more concentration-sensitive at the higher HCG 
concentrations. The plate is used to measure HCG and FSH 
concentrations in the urine of women in pregnancy testing, 
giving good correlations with results obtained by other 
means and achieving effective concentration measurements 
for HCG over a concentration range of two or three orders 
of magnitude by correct choice of the best HCG spot and 
dose response curve. 
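
The behaviour described in Example 4, where the higher affinity antibody responds best at low HCG concentrations and the lower affinity antibody at high ones, can be illustrated with a simple equilibrium model. The sketch below assumes 1:1 reversible binding at equilibrium, which is an assumption made here for illustration and not a statement of the method used in the Examples. Concentrations are expressed in units of 1/K, as in the drawing; the molar HCG concentrations in the loop are hypothetical, since the specification quotes HCG in ng/ml or IU/ml, and all function names are illustrative.

```python
import math


def bound_complex(a, g):
    """Equilibrium concentration of complex, in units of 1/K.

    a -- total binding sites, in units of 1/K
    g -- total analyte, in units of 1/K
    Solves b = (a - b) * (g - b) for the physically meaningful root,
    using a form that avoids cancellation when a or g is small.
    """
    s = a + g + 1.0
    return 2.0 * a * g / (s + math.sqrt(s * s - 4.0 * a * g))


def fractional_occupancy(a, g):
    """Fraction F of binding sites occupied by analyte."""
    return bound_complex(a, g) / a


def fraction_analyte_bound(a, g):
    """Fraction of the analyte in the sample that is bound."""
    return bound_complex(a, g) / g


K_LOW = 6e8      # liters/mole, first anti-HCG antibody in Example 4
K_HIGH = 1.3e11  # liters/mole, second anti-HCG antibody in Example 4
SITES = 0.01     # binding sites in units of 1/K (i.e. 0.01 V/K moles)

for conc in (1e-12, 1e-11, 1e-10, 1e-9, 1e-8):  # hypothetical mol/liter
    f_low = fractional_occupancy(SITES, K_LOW * conc)
    f_high = fractional_occupancy(SITES, K_HIGH * conc)
    print(f"{conc:.0e} mol/l: F(low K) = {f_low:.4f}, F(high K) = {f_high:.4f}")
```

Under this model the occupancy of a lightly loaded spot approaches g/(1 + g), so it changes most steeply when the analyte concentration is near 1/K. The same functions also show that a loading of 0.1 V/K keeps the bound fraction of analyte below 10%, and 0.01 V/K keeps it below 1%.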

Production of Labelled Antibodies 

The labelling of the antibodies with fluorescent labels can 
be carried out by a well known and standard technique, see 
Leslie Hudson and Frank C. Hay, "Practical Immunology", 
Blackwell Scientific Publications (1980), pages 11-13, for 
example as follows: 

The monoclonal antibody anti-FSH 3G3, an FSH specific 
(beta subunit) antibody having an affinity constant (K) of 
1.3×10^8 liters per mole, was produced in the Middlesex 
Hospital Medical School, and was labelled with TRITC 
(rhodamine isothiocyanate) or Texas Red, giving a red 
fluorescence. 

The monoclonal antibody anti-FSH 8D10, a cross-reacting
(alpha subunit) antibody having an affinity constant
(K) of 1 x 10^11 liters per mole, was likewise produced in the
Middlesex Hospital Medical School and was labelled with
FITC (fluorescein isothiocyanate), giving a yellow-green
fluorescence.

The general procedure used involved ascites fluid purifi- 
cation (ammonium sulphate precipitation and T-gel 
chromatography) followed by labelling, according to the 
following steps: 

1.a. Ammonium sulphate purification

1. Add 4.1 ml saturated ammonium sulphate solution to 5 
ml antibody preparation (culture supernatant or 1:5 
diluted ascites fluid) under constant stirring (45% 
saturation). 

2. Continue stirring for 30-90 min. Centrifuge at 2500 
rpm for 30 min. 

3. Discard the supernatant and dissolve the precipitate in
PBS (final volume 5 ml). Repeat Steps 1 and 2, OR

4. Add 3.6 ml saturated ammonium sulphate (40%
saturation) under constant stirring. Repeat Step 2.

5. Discard the supernatant and dissolve the pellet in the 
desired buffer. 

6. Dialyse overnight in cold against the same buffer (using 
fresh, boiled-in-d/w dialysis bag). 

7. Determine the protein concentration either from A280 or by
Lowry estimation.
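
The saturation figures quoted in the steps above can be checked with a short calculation. This is only a sketch: it assumes volumes are additive and that the sample contains no ammonium sulphate before the addition unless `start_percent` says otherwise. The function names are hypothetical.

```python
def percent_saturation(added_saturated_ml, sample_ml, start_percent=0.0):
    """Percent saturation after adding saturated solution to a sample.

    Assumes additive volumes; start_percent is the saturation of the sample
    before the addition (0 for a fresh antibody preparation).
    """
    total_ml = added_saturated_ml + sample_ml
    return (100.0 * added_saturated_ml + start_percent * sample_ml) / total_ml


def volume_for_target(sample_ml, target_percent):
    """Volume of saturated solution needed to bring a fresh sample to target_percent."""
    if not 0.0 <= target_percent < 100.0:
        raise ValueError("target_percent must be in [0, 100)")
    return sample_ml * target_percent / (100.0 - target_percent)


if __name__ == "__main__":
    # Step 1: 4.1 ml saturated solution added to 5 ml antibody preparation.
    print(f"Step 1: {percent_saturation(4.1, 5.0):.1f}% saturation")
    # Step 4: 3.6 ml added to the 5 ml redissolved precipitate.
    print(f"Step 4: {percent_saturation(3.6, 5.0):.1f}% saturation")
```

With these assumptions Step 1 comes out at about 45%, as stated, and Step 4 at roughly 42%, close to the nominal 40%.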

1.b. T-gel Chromatography: (Buffer: 1M Tris-Cl, pH 7.6.
Solid potassium sulphate)

1. Clear 2 ml of ascites fluid by centrifugation at 4000 
rpm. 

2. Add 1M Tris-Cl solution to achieve final concentration 
of 0.1M. 

3. Add sufficient amount of solid potassium sulphate. 
Final concentration=0.5M. 

4. Apply the ascites fluid to the T-gel column.

5. Wash the column with 0.1M Tris-Cl buffer containing
0.5M potassium sulphate, until the protein profile (at A280)
returns to zero.

6. Elute the absorbed protein using 0.1M Tris-Cl buffer as 
the eluant. 

7. Pool the fractions containing antibody activity and
concentrate using Amicon 30 concentrators.

8. If HPHT purification is to be carried out, use HPHT 
chromatography Starting buffer during Step 7. 

2. Labelling of Antibodies FITC/TRITC conjugation 

1. Dialyse the purified IgG protein into 0.25M carbonate-bicarbonate
buffer, pH 9.0 to a concentration of 20
mg/ml.

2. Add FITC/TRITC to achieve a 1:20 ratio with protein 
(i.e. 0.05 mg for every 1 mg of protein). 

3. Mix and incubate at 4° C. for 16-18 hrs. 

4. Separate the conjugated protein from unconjugated by: 

a. Sephadex G-25 chromatography for FITC label, 

or 

b. DEAE-Sephacel chromatography for TRITC/FITC 
label. 

Buffer system: 
PBS for (a). 

0.005M Phosphate, pH 8.0 and 0.18M 
Phosphate, pH 8.0 for (b). 

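Calculation of FITC:protein coupling ratio:

$$\text{FITC:protein ratio} = \frac{2.87 \times \mathrm{O.D.}_{495\,\mathrm{nm}}}{\mathrm{O.D.}_{280\,\mathrm{nm}} - 0.35 \times \mathrm{O.D.}_{495\,\mathrm{nm}}}$$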
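
The coupling ratio and the dye quantity from step 2 are simple enough to compute directly. The sketch below implements the formula as given (2.87 x O.D. at 495 nm, divided by O.D. at 280 nm less 0.35 x O.D. at 495 nm); the function names and the optical densities in the usage lines are hypothetical illustration values, not measurements from the text.

```python
def fitc_protein_ratio(od495, od280):
    """FITC:protein coupling ratio = 2.87 * OD495 / (OD280 - 0.35 * OD495)."""
    corrected_od280 = od280 - 0.35 * od495
    if corrected_od280 <= 0:
        raise ValueError("OD280 must exceed 0.35 * OD495")
    return 2.87 * od495 / corrected_od280


def dye_mass_mg(protein_mg, dye_to_protein=1.0 / 20.0):
    """Mass of FITC/TRITC for a 1:20 ratio, i.e. 0.05 mg dye per mg protein."""
    return protein_mg * dye_to_protein


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(f"coupling ratio: {fitc_protein_ratio(od495=0.5, od280=1.0):.2f}")
    print(f"dye for 20 mg protein: {dye_mass_mg(20.0):.2f} mg")
```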



EXAMPLE 4 

Reagents

1 TSH standards from the National Institute for Biological 
Standards and Control 

2 TSH-free Serum for making up TSH standards 

3 125I-labelled TSH

4 Anti-TSH monoclonal antibodies from The Scottish 
Antibody Production Unit 

5 Phosphate buffer, 0.1M, pH 7.4 

6 Tris-HCl buffer, 0.05 M, pH 7.6, containing 0.5% bovine 
serum albumin (BSA), 0.05% Tween 20 and 0.1% 
sodium azide 

7 Wash buffer: Phosphate buffer, 0.1M, pH 7.4, containing
0.1% Tween 20 and 0.1% sodium azide

8 Black microtitre strips from Dynatech

9 SuperBlock from Pierce 

A. Protocol and Conditions for the Radioimmunoassay of 
Thyroid Stimulating Hormone (TSH) 

1. An aliquot of 50 µl of 50 µg/ml anti-TSH monoclonal
antibody in phosphate buffer was added to microtitre
wells and incubated for 1 hour at room temperature.

2. The microtitre wells were washed with phosphate 
buffer, blocked with SuperBlock for 30 minutes at 
room temperature and then washed again. 

3. An aliquot of 100 µl of TSH standards made up in
TSH-free serum (to yield final concentrations of 0,
1 x 10^-9, 2 x 10^-9, 4 x 10^-9, 8 x 10^-9, 12 x 10^-9, 16 x 10^-9 and
20 x 10^-9 M/L) or unknown serum samples and 100 µl of
125I-labelled TSH in Tris-HCl assay buffer were added
to triplicate anti-TSH monoclonal antibody coated
microtitre wells, shaken for 1 hour at room
temperature, washed with wash buffer and counted for
radioactivity. The concentration of TSH in the
unknown samples can be read from the standard curve.

The incubation period of 1 hour for the assay is far less 
than the time required for the binding reaction to go to 
equilibrium, but, provided the standards are measured under 
the same conditions, the unknown sample can be measured
against those standards. The effective affinity constant for 
the antibody will of course be that which pertains after 1 
hour incubation and under the same conditions as the assay 
itself. 

B. Procedure for Obtaining the Affinity Constant K of the
Anti-TSH Monoclonal Antibody Used in a Radioimmunoassay
Performed Under the Conditions Described in (A)

1. An aliquot of 50 µl of 50 µg/ml anti-TSH monoclonal
antibody in phosphate buffer was added to microtitre
wells and incubated for 1 hour at room temperature.

2. The microtitre wells were washed with phosphate 
buffer, blocked with SuperBlock for 30 minutes at 
room temperature and then washed again. 

3. An aliquot of 100 µl of TSH standards made up in
TSH-free serum (to yield final concentrations of 0,
1 x 10^-9, 2 x 10^-9, 4 x 10^-9, 8 x 10^-9, 12 x 10^-9, 16 x 10^-9
and 20 x 10^-9 M/L) and 100 µl of 125I-labelled TSH in
Tris-HCl assay buffer were added to triplicate antibody
coated microtitre wells, shaken for 1 hour at room
temperature, washed with wash buffer and counted for
radioactivity.

4. A standard Scatchard plot of Bound/Free vs. Bound 
TSH was used to obtain the affinity constant K for the 
monoclonal anti-TSH antibody. 
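
A Scatchard analysis of this kind can be sketched in a few lines. For a single class of binding sites, Bound/Free = K(Bmax - Bound), so a straight-line fit of Bound/Free against Bound has slope -K and x-intercept Bmax. The code below is an illustrative least-squares fit, not the procedure actually used for the example; the function names are hypothetical and the data in the usage section are synthetic, generated from an assumed K and Bmax rather than taken from any measurement.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (Bound, Bound/Free); returns (K, Bmax).

    Slope of the line is -K and its intercept on the y-axis is K * Bmax.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired bound/free values")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_b = sum(bound) / n
    mean_r = sum(ratios) / n
    sxx = sum((b - mean_b) ** 2 for b in bound)
    sxy = sum((b - mean_b) * (r - mean_r) for b, r in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_b
    k = -slope
    return k, intercept / k


def synthetic_bound(free, k, bmax):
    """Bound concentrations for given free concentrations (single-site model)."""
    return [bmax * k * f / (1.0 + k * f) for f in free]


if __name__ == "__main__":
    # Synthetic data only: assumed K = 1.1e8 L/mol and Bmax = 1e-10 mol/L.
    free = [1e-9, 2e-9, 4e-9, 8e-9, 1.2e-8, 1.6e-8, 2e-8]
    bound = synthetic_bound(free, k=1.1e8, bmax=1e-10)
    k_fit, bmax_fit = scatchard_fit(bound, free)
    print(f"K = {k_fit:.3e} L/mol, Bmax = {bmax_fit:.3e} mol/L")
```

With real counts the points scatter about the line, and the free concentration must first be derived from total and bound tracer; that step is omitted here.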

C. A TSH Assay Using an Amount of Capture Antibody
≤0.1 V/K and Deposited on the Solid-Phase as Microspots

Since the assay volume V is 0.2 ml or 2 x 10^-4 L and the
affinity constant K of the anti-TSH capture antibody used
under conditions described in (B) was found to be 1.1 x 10^8
L/M, the maximum amount of capture antibody
allowed in the assay under ambient analyte conditions is

$$0.1\,V/K = \frac{0.1 \times 2 \times 10^{-4}}{1.1 \times 10^{8}}\ \mathrm{M} \approx 1.8 \times 10^{-13}\ \mathrm{M}$$

or a capture antibody concentration of 9 x 10^-10 M/L.
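
The arithmetic above can be reproduced as follows. This is a minimal sketch using the stated assay volume (0.2 ml) and affinity constant (1.1 x 10^8 liters/mole); the function name is hypothetical.

```python
def max_capture_moles(volume_l, k, fraction=0.1):
    """Upper limit on capture antibody (moles): fraction * V / K."""
    return fraction * volume_l / k


if __name__ == "__main__":
    volume_l = 2e-4   # 0.2 ml assay volume
    k = 1.1e8         # liters/mole, from the Scatchard analysis in (B)
    moles = max_capture_moles(volume_l, k)
    print(f"0.1 V/K = {moles:.2e} mol")
    print(f"equivalent concentration = {moles / volume_l:.2e} mol/L")
```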
Assay Protocol: 

1. A 0.5 µl droplet of a monoclonal anti-TSH capture
antibody in phosphate buffer and at a concentration of
200 µg/ml was added to each microtitre well and
aspirated instantly. This procedure resulted in antibody
microspots with a coated area of approximately 10^6

Molar amount of coated antibody on microspot

$$= \frac{\text{coated area} \times \text{antibody density}}{\text{Avogadro number}} = \frac{10^{6} \times 10^{4}}{6.02 \times 10^{23}}\ \mathrm{M} \approx 1.7 \times 10^{-14}\ \mathrm{M}$$

or a capture antibody concentration of 0.85 x 10^-10 M/L.

2. The microtitre wells were washed with phosphate 
buffer and the unreacted sites blocked with SuperBlock 
for 30 minutes at room temperature and then washed 
again with phosphate buffer. 

3. 100 µl of TSH standards (made up in TSH-free serum)
or unknown samples plus 100 µl of Tris-HCl assay
buffer were added to triplicate microtitre wells, shaken 
for 1 hour at room temperature and washed with wash 
buffer. 

4. The TSH bound sites were back-titrated using fluorescent
labelled anti-TSH developing monoclonal antibody
raised against a different site on the TSH molecule
and complementary to the capture antibody deposited
as microspot on the solid-phase. An aliquot of 200 µl of
the developing antibody in Tris-HCl assay buffer was
added to the microtitre wells, shaken for 1 hour at room
temperature, washed with wash buffer, scanned with a
BioRad laser scanning confocal microscope and the
amount of fluorescence on the microspots quantified.
The concentration of TSH in the unknown samples
was read from the standard curve.

Although, for the purpose of illustration, the affinity 
constant of the antibody was measured under the assay 
conditions, in practice, in many cases it may not be neces- 
sary actually to perform such a measurement, so long as it 
is obvious, having regard to the details of the assay in 
question, that the amount of capture antibody used on any 
spot is going to be less than 0.1 V/K. 
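
For the TSH microspot assay described above, the check against the 0.1 V/K limit can be written out explicitly. The sketch below takes the figures quoted in the example (about 1.7 x 10^-14 moles of antibody per microspot, V = 0.2 ml, K = 1.1 x 10^8 liters/mole) and expresses the spot loading as a multiple of V/K; the function name is hypothetical.

```python
def loading_in_units_of_v_over_k(spot_moles, volume_l, k):
    """Express the antibody on one spot as a multiple of V/K."""
    return spot_moles * k / volume_l


if __name__ == "__main__":
    spot_moles = 1.7e-14  # moles of capture antibody per microspot
    volume_l = 2e-4       # 0.2 ml assay volume
    k = 1.1e8             # liters/mole
    multiple = loading_in_units_of_v_over_k(spot_moles, volume_l, k)
    print(f"spot loading = {multiple:.4f} V/K")
    print("within 0.1 V/K limit:", multiple <= 0.1)
    print(f"equivalent concentration = {spot_moles / volume_l:.2e} mol/L")
```

On these figures the microspot carries a little under 0.01 V/K, roughly a tenth of the permitted maximum.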

What is claimed is: 

1. A method for determining the ambient concentration of 
an analyte of interest among a plurality of analytes in a 
liquid sample of volume V liters, comprising: 
loading a plurality of different binding agents, each being 
labelled with a marker and being capable of reversibly 
binding an analyte which is or may be present in the 
liquid sample and is specific for said analyte as com- 
pared to the other components of the liquid sample, 
onto a support means at a plurality of spaced apart 
small spots such that not more than 0.1 V/K moles of 
binding agent are present on any spot, where K liters/ 
mole is the affinity constant of said binding agent for 
said analyte; 

contacting the loaded support means with the liquid 
sample to be analyzed, such that each of the spots is 
contacted in the same step with said liquid sample, the 
amount of liquid used in said sample being such that 
only an insignificant proportion of any analyte present 
in said liquid sample becomes bound to said binding 
agent specific for said analyte;

contacting the support with a site-recognition reagent 
specific for each binding agent in a competitive or 
non-competitive technique, the site-recognition reagent 
being capable of recognizing either the unfilled binding
sites or the filled binding sites on said binding agent, 
said site-recognition reagent being labelled with a 
marker different from the marker on said binding agent, 
and

measuring a ratio of signals from said markers on the
site-recognition reagent and the binding reagent from at
least a part of the spot, from which the analyte of
interest is determined.

2. A method according to claim 1, wherein the markers on 
the site-recognition reagent and the binding reagent are 
fluorescent markers. 

3. A method according to claim 2, wherein the ratio of 
signals is measured using a laser scanning confocal micro- 
scope. 

4. A method for determining the fractional binding site 
occupancy of a plurality of binding agents by a plurality of 
analytes in a liquid sample of V liters, comprising: 

(a) loading a plurality of different binding agents, each 
being capable of reversibly binding an analyte which is 
or may be present in the liquid sample and is specific 
for said analyte as compared to the other components of 
the liquid sample, onto a support at a plurality of spaced 
apart small spots such that each spot has a high coating 
density of one of said binding agents but not more than 
0.1 V/K moles of binding agent are present on any one 
spot, where K liters/mole is the affinity constant of said 
binding agent for said analyte; 

(b) contacting the loaded support with the liquid sample 
to be analyzed, such that each of the spots is contacted 
in the same step with said liquid sample, the amount of 
liquid used in said sample being such that only an 
insignificant proportion of any analyte present in said 
liquid sample becomes bound to said binding agent 
specific for said analyte; and 

(c) thereafter contacting the loaded support with site- 
recognition reagents which recognize either the unfilled 
binding sites or filled binding sites of that binding 
agent, the site-recognition reagents being labelled with
markers from which the fractional binding site occu- 
pancy for each binding agent is determined. 

5. The method of claim 4, wherein the site-recognition
reagents are labelled with fluorescent markers. 

6. The method of claim 4, wherein the presence of the 
site-recognition reagents on each respective binding agent is
determined consecutively. 

7. The method of claim 4, wherein the presence of the 
site-recognition reagents on each respective binding agent is
determined simultaneously. 

8. The method of claim 4, further comprising, after step
(c), calculating the concentration level of each reagent using
the determined value of the fractional binding site occupancy.

9. A method for detecting a plurality of analytes in a liquid 
sample of volume V liters, comprising:

loading a plurality of different binding agents, each being
capable of reversibly binding an analyte which is or 
may be present in the liquid sample and is specific for 
said analyte as compared to the other components of 
the liquid sample, onto a support means at a plurality of 
spaced apart small spots such that each spot has a high 
coating density of one of said binding agents but not 
more than 0.1 V/K moles of binding agent are present
on any spot, where K liters/mole is the affinity constant
of said binding agent for said analyte;

contacting the loaded support means with the liquid 
sample to be analyzed, such that each of the spots is 
contacted in the same step with said liquid sample, the 
amount of liquid used in said sample being such that 
only an insignificant proportion of any analyte present 
in said liquid sample becomes bound to said binding 
agent specific for said analyte; 

contacting the support with a site-recognition reagent 
specific for each binding agent in a competitive or 
non-competitive technique, the site-recognition reagent 
being capable of recognizing either the unfilled binding 
sites or the filled binding sites on said binding agent, 
said site-recognition reagent being labelled with a 
marker; and 

measuring the signal from the marker of the site- 
recognition reagent in a particular location to detect the 
presence of said plurality of analytes in said sample. 

10. A method as claimed in claim 9, wherein each of said 
spots has a size of less than 1 mm^2.

11. A method as claimed in claim 10, wherein each of said 
spots contains more than 10^4 molecules of binding agent.

12. A method as claimed in claim 11, wherein each of said 
spots has less than 0.01 V/K moles of binding agent. 

13. A method as claimed in claim 11, wherein said binding 
agents used have affinity constants for said analytes of from 
10^8 to 10^13 liters per mole.

14. A method as claimed in claim 11, wherein said binding 
agents used have affinity constants for said analytes of the 
order of 10^10 to 10^11 liters per mole.

15. A method as claimed in claim 11, wherein the volume 
of said liquid sample is not more than 0.1 liter. 

16. A method as claimed in claim 11, wherein the volume 
of said liquid sample is 400 to 1000 microliters. 

17. A method as claimed in claim 9, wherein said binding 
agents loaded onto said support means are antibodies for the 
analytes whose concentrations are to be determined.

ABSTRACT



A method for determining the ambient concentrations 
of a plurality of analytes in a liquid sample of volume V 
liters, comprises 

loading a plurality of different binding agents, each
being capable of binding specifically and reversibly
an analyte of interest onto a support means at a plurality
of spaced apart locations such that not more than
0.1 V/K moles of each binding agent are present at
any location, where K liters/mole is the equilibrium
constant of each such binding agent;

contacting the loaded support means with the sample
to be analyzed, such that each of the spaced apart 
locations is contacted in the same operation with the 
sample, the amount of sample liquid being such that 
only an insignificant proportion of any analyte pres- 
ent in the sample becomes bound to the binding agent 
specific for it, and 

measuring a parameter representative of the fractional 
occupancy by the analytes of the binding agents at 
the spaced apart locations by a competitive or non- 
competitive assay technique, using a labelled site- 
recognition reagent for each binding agent capable of 
recognizing either the unfilled binding sites or the 
filled binding sites on the binding agent, which ena- 
bles the amount of said reagent in the particular loca- 
tion to be measured. A device and kit for use in the 
method are also provided. 

17 Claims, 1 Drawing Sheet

DETERMINATION OF AMBIENT CONCENTRATION OF SEVERAL ANALYTES

This is a continuation of co-pending application Ser.
No. 07/460,878, filed as PCT/GB88/00649, Aug. 5, 1988.

FIELD OF THE INVENTION

The present invention relates to the determination of
ambient analyte concentrations in liquids, for example
the determination of analytes such as hormones, proteins
and other naturally occurring or artificially present
substances in biological liquids such as body fluids.

BACKGROUND OF THE INVENTION

I have proposed in International Patent Application
WO84/01031 to measure the concentration of an analyte
in a fluid by contacting the fluid with a trace
amount of a binding agent such as an antibody specific
for the analyte in the sense that it reversibly binds the
analyte but not other components of the fluid, determining
a quantity representative of the proportional occupancy
of binding sites on the binding agent and estimating
from that quantity the analyte concentration. In that
application I point out that, provided that the amount of
binding agent is sufficiently low that its introduction
into the fluid causes no significant diminution of the
concentration of ambient (unbound) analyte, the fractional
occupancy of the binding sites on the binding
agent by the analyte is effectively independent of the
absolute volume of the fluid and of the absolute amount
of binding agent, i.e. independent within the limits of
error usually associated with the measurement of fractional
occupancy. In such circumstances, and in these
circumstances only, the initial concentration [H] of
analyte in the fluid is related to the fraction (Ab/Ab_0) of
binding sites on the binding agent occupied by the analyte
by the equation:

$$\frac{Ab}{Ab_0} = \frac{K_{ab}[H]}{1 + K_{ab}[H]}$$

where K_ab (hereinafter referred to as K) is the equilibrium
constant for the binding of the analyte to the binding
sites and is a constant for a given analyte and binding
agent at any one temperature. This constant is generally
known as the affinity constant, especially when
the binding agent is an antibody, for example a monoclonal
antibody.

The concept of using only a trace amount of binding
agent is contrary to generally recommended practice in
the field of immunoassay and immunometric techniques.
[...] in such [...]
"Methods in Investigative and Diagnostic Endocrinology",
ed. S. A. Berson and R. S. Yalow, 1973 at pages
[...], it is proposed that in the performance of a
competitive immunoassay maximum sensitivity of the
assay is achieved if the proportion of the "tracer" analyte
that is bound approximates to 50%. In order to
achieve such a high degree of binding, the
theory of Berson and Yalow, to this day generally accepted
by other workers in the field, requires that the
concentration of binding agent (or, strictly speaking, of
binding sites, each molecule of binding agent conventionally
having one or at most two binding sites) must
be greater than or equal to the reciprocal of the equilibrium
constant (K) of the binding agent for the analyte,
i.e. [Ab] ≥ 1/K. For a sample of volume V the total
amount of binding agent (or binding sites) must therefore
be greater than or equal to V/K. A binding agent
which is a monoclonal antibody may, for example, have
an equilibrium constant (K) which is of the order of
10^11 liters/mole for the specific antigen to which it
binds. Thus, under the above generally accepted practice,
a binding agent (or site) concentration of the order
of 10^-11 mole/liter or more is required for binding
agents of such an equilibrium constant and, with fluid
sample volumes of the order of 1 milliliter, the use of
10^-14 or more mole of binding agent (or site) is conventionally
deemed necessary. Avogadro's number is about
6 x 10^23 so that 10^-14 mole of binding site is equivalent
to more than 10^9 molecules of binding agent even assuming
that each molecule possesses two binding
sites. For binding agents of the
very highest affinity K is less than 10^13 liters/mole so
that conventional practice requires more than 10^7 molecules
of binding agent, whereas binding agents with
lower affinity of the order of 10^8 liters/mole necessitate
the use of more than 10^12 molecules under conventional
practice. In fact all immunoassay kits marketed commercially
at the present time conform to these concepts
and use an amount of binding site approximating to or,
more frequently, considerably in excess of V/K; indeed
in certain types of kit relying on the use of labelled
antibodies it is conventional to use as much binding
agent as possible, binding proportions of analyte greatly
exceeding 50%.

Because of the binding of substantial proportions, for
example 50%, of the analyte in the liquid samples under
test in such systems, the fractional occupancy of the
binding sites of the binding agent is not independent of
the volume of the fluid sample so that for accurate
quantitative assays it is necessary to control accurately
the volume of the sample, keeping it constant in all tests,
whether of the sample of unknown concentration or of
the standard samples of known concentration used to
generate the dose response curve. Furthermore, such
systems also require careful control of the amount of
binding agent present in the standard and control incubation
tubes. These limitations of present techniques are
universally recognised and accepted.

UK patent Application 2,099,578 A discloses a device
for immunoassays comprising a porous solid support to
which antigens, or less frequently immunoglobulins, are
[...] a plurality of spaced apart locations, said device
[...] a large number of qualitative or quantitative
immunoassays to be performed on the same support
[...] human blood serum. However, although the
[...] microdot produced by supplying droplets of
antigen-containing solutions or suspensions, the number of
moles of antigen present at each location is apparently
still envisaged as being enough to bind essentially all of
the analyte (e.g. antibody) whose concentration is to be
measured that is present in the liquid sample under test.
This is apparent from the fact that the quantitative
method used in that application (page 3, lines 21-28)
involves calibration with known amounts of immunoglobulin
being applied to the support; but this means
that, in the samples being tested, essentially every molecule
must be extracted from the sample in order for a
true comparison to be made and hence that large
amounts of antigen (i.e. the binding agent in this situation)
are required in each microdot, greatly in excess of
the total amount of analyte (i.e. antibody in this situation)
present in the sample.

SUMMARY OF THE INVENTION

The present invention involves the realisation that the
use of high quantities of binding agent is neither necessary
for good sensitivity in immunoassays nor is it generally
desirable. If, instead of being kept as large as
possible, the amount of binding agent is reduced so that
only an insignificant proportion of the analyte is reversibly
bound to it, generally less than 10%, usually less
than 5% and for optimum results only 1 or 2% or less,
not only is it no longer necessary to use an accurately
controlled, constant volume for all the liquid samples
(standard solutions and unknown samples) in a given
assay, but it is also possible to obtain reliable and sometimes
even improved estimates of analyte concentration
using much less than V/K moles of binding agent binding
sites, say not more than 0.1 V/K and preferably less
than 0.01 V/K. For a binding agent having an equilibrium
constant (K) for the analyte of the order of 10^11
liters/mole and samples of approximately 1 ml size this
is approximately equivalent to not more than 10^8, preferably
less than 10^7, molecules of binding agent at each
location in an individual array. If the value of K is 10^13
liters/mole the figures are 10^6 and 10^5 molecules respectively,
and if K is of the order of 10^8 liters/mole they are
10^11 and 10^10 molecules respectively. Below 10^2 molecules
of binding agent at a single location the accuracy
of the measurement would become progressively less as
the fractional occupancy of the binding agent sites by
the analyte would be able to change only in discrete
steps as individual sites become occupied or unoccupied,
but in principle at least the use of as low as 10
molecules would be permissible if an estimate with an
accuracy of 10% is acceptable. Practical considerations
may give rise to a preference for more than 10^4 molecules.
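
The molecule counts quoted above follow from converting 0.1 V/K (or 0.01 V/K) moles into a number of binding sites with Avogadro's number. The sketch below does that conversion for a 1 ml sample; the function names are hypothetical, and the result counts binding sites, so the number of antibody molecules is lower by the number of sites per molecule. The text's figures are order-of-magnitude values, and the code compares orders of magnitude only.

```python
import math

AVOGADRO = 6.022e23  # per mole


def max_binding_sites(volume_l, k, fraction=0.1):
    """Number of binding sites corresponding to fraction * V / K moles."""
    return fraction * volume_l / k * AVOGADRO


def order_of_magnitude(x):
    """Power of ten of x, e.g. 6.0e8 -> 8."""
    return math.floor(math.log10(x))


if __name__ == "__main__":
    volume_l = 1e-3  # approximately 1 ml sample
    for k in (1e8, 1e11, 1e13):
        for fraction in (0.1, 0.01):
            sites = max_binding_sites(volume_l, k, fraction)
            print(f"K = {k:.0e}, {fraction} V/K -> about 10^{order_of_magnitude(sites)} sites")
```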

It will be appreciated that the abovementioned GB
patent application 2,099,578A, which for quantitative
estimation relies on large amounts of binding agent and
essentially total sequestration of all analyte, fails to
recognise the advance achieved by the present invention,
which instead relies on a different analytical principle
requiring measurement of the fractional occupancy
of the binding agent and which thus requires only
a very low proportion of the total analyte molecules
present to be sequestered from the sample.

Following the recognition that the use of such small
amounts of binding agent is permissible, it becomes
feasible to place the binding agent required for a single
concentration measurement on a very small area of a
solid support and hence to place in juxtaposition to one
another but at spatially separate points on a single solid
support a wide variety of different binding agents specific
for different analytes which are or may be present
simultaneously in a liquid to be analysed. Simultaneous
exposure of each of the separate points to the liquid to
be analysed will cause each binding agent spot to take
up the analyte for which it is specific to an extent (i.e.
fractional binding site occupancy) representative of the
analyte concentration in the liquid, provided only that
the volume of solution and the analyte concentration
therein are large enough that only an insignificant fraction
(generally less than 10%, usually less than 5%) of
the analyte is bound to the point. The fractional binding
site occupancy for each binding agent can then be determined
using separate site-recognition reagents which
recognise either the unfilled binding sites or filled binding
sites of the different binding agents and which are
labelled with markers enabling the concentration levels
of the separate reagents bound to the different binding
agents to be measured, for example fluorescent markers.
Such measurements may be performed consecutively,
for example using a laser which scans across the support,
or simultaneously, for example using a photographic
plate, depending on the nature of the labels.
Other imaging devices such as a television camera can
also be used where appropriate. Because the binding
agents are spatially separate from one another it is possible
to use only a small number of different marker labels
or even the same marker label throughout and to scan
each binding agent location separately to determine the
presence and concentration of the label. By use of the
invention considerably more than 3 analyses can be
performed with a single exposure of the solid support
with liquid to be analysed, for example 10, 20, 30, 50 or
even up to 100 or several hundreds of analyses.
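
The condition that only an insignificant fraction of the analyte is bound can be checked against an exact equilibrium calculation. The sketch below is an illustration under a stated assumption, namely simple 1:1 reversible binding to a single class of sites with all quantities expressed as concentrations in the sample volume; the function names are hypothetical. It solves the mass-action quadratic for the bound concentration and compares the resulting occupancy with the ideal value K[H]/(1 + K[H]).

```python
import math


def equilibrium(sites_total, analyte_total, k):
    """Exact 1:1 mass-action equilibrium.

    Returns (fractional occupancy of sites, fraction of analyte bound).
    Bound B is the smaller root of B^2 - (S + H + 1/K) B + S*H = 0, written
    in a form that avoids cancellation when S is very small.
    """
    a = sites_total + analyte_total + 1.0 / k
    disc = a * a - 4.0 * sites_total * analyte_total
    bound = 2.0 * sites_total * analyte_total / (a + math.sqrt(disc))
    return bound / sites_total, bound / analyte_total


def ideal_occupancy(analyte_total, k):
    """Occupancy when analyte depletion is negligible: K[H] / (1 + K[H])."""
    return k * analyte_total / (1.0 + k * analyte_total)


if __name__ == "__main__":
    k = 1e11                 # liters/mole
    analyte = 1.0 / k        # analyte concentration equal to 1/K
    for multiple in (0.01, 0.1, 1.0, 10.0):
        occ, frac_bound = equilibrium(multiple / k, analyte, k)
        print(f"sites = {multiple:>5} /K: occupancy {occ:.3f} "
              f"(ideal {ideal_occupancy(analyte, k):.3f}), "
              f"analyte bound {100 * frac_bound:.1f}%")
```

Under this model, site concentrations of 0.01/K and 0.1/K bind about 0.5% and 5% of the analyte respectively and leave the occupancy close to the ideal value, whereas 10/K binds most of the analyte and the occupancy no longer reflects the ambient concentration.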

Overall, therefore, the present invention provides a 
method for determining the ambient concentrations of a 
plurality of analytes in a liquid sample of volume V 
liters, comprising: 

loading a plurality of different binding agents, each 
being capable of reversibly binding an analyte 
which is or may be present in the liquid and is 
specific for that analyte as compared to the other 
components of the liquid sample, onto a support 
means at a plurality of spaced apart locations such 
that each location has not more than 0.1 V/K 
moles of a single binding agent, where K liters/mole
is the equilibrium constant of the binding
agent for the analyte, 
contacting the loaded support means with the liquid 
sample to be analysed such that each of the spaced 
apart locations is contacted in the same operation 
with the liquid sample, the amount of liquid used in 
the sample being such that only an insignificant 
proportion of any analyte present in the liquid 
sample becomes bound to the binding agent spe- 
cific for it, and 
measuring a parameter representative of the frac- 
tional occupancy by the analytes of the binding 
agents at the spaced apart locations by a competi- 
tive or non-competitive assay technique using a 
site-recognition reagent for each binding agent 
capable of recognising either the unfilled binding 
sites or the filled binding sites on the binding agent, 
said site-recognition reagent being labelled with a 
marker enabling the amount of said reagent in the 
particular location to be measured. 
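
Once a fractional occupancy has been measured at a location, the occupancy relation given in the Background, Ab/Ab_0 = K[H]/(1 + K[H]), can be inverted to estimate the ambient analyte concentration. The sketch below shows that inversion only; in practice the text relies on dose response curves built from standards, and the function names here are hypothetical.

```python
def occupancy_from_concentration(conc, k):
    """Fractional occupancy F = K[H] / (1 + K[H])."""
    return k * conc / (1.0 + k * conc)


def concentration_from_occupancy(occupancy, k):
    """Invert the occupancy relation: [H] = F / (K (1 - F))."""
    if not 0.0 <= occupancy < 1.0:
        raise ValueError("occupancy must be in [0, 1)")
    return occupancy / (k * (1.0 - occupancy))


if __name__ == "__main__":
    k = 1e11  # liters/mole, an illustrative affinity constant
    for f in (0.1, 0.5, 0.9):
        print(f"F = {f}: [H] = {concentration_from_occupancy(f, k):.2e} mol/L")
```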
The invention also provides a device for use in deter- 
mining the ambient concentrations of a plurality of 
analytes in a liquid sample of volume V liters, compris- 
ing a solid support means having located thereon at a 
plurality of spaced apart locations a plurality of differ- 
ent binding agents, each binding agent being capable of 
reversibly binding an analyte which is or may be pres- 
ent in the liquid sample and is specific for that analyte as 
compared to the other components of the liquid sample, 
each location having not more than 0.1 V/K, preferably
less than 0.01 V/K, moles of a single binding agent,
where K liters/mole is the equilibrium constant of that
binding agent for reaction with the analyte to which it 
is specific. 

A kit for use in the method according to the invention comprises a device according to the invention, a plurality of standard samples containing known concentrations of the analytes whose concentrations in the liquid sample are to be measured and a set of labelled site-recognition reagents for reaction with filled or unfilled binding sites on the binding agents.

5,432,099

In arriving at the method of the invention, I have found that, generally speaking, for antibodies having an affinity constant K liters/mole for an antigen, the relationship between the antibody concentration and the fractional occupancy of the binding sites at any particular antigen concentration and the relationship between the antibody concentration and the percentage of antigen bound to the binding sites at any particular antigen concentration follow the same curves provided that the antibody concentrations and the antigen concentrations are each expressed in terms of fractions or multiples of 1/K.

BRIEF DESCRIPTION OF THE DRAWINGS

The principle underlying the method of the invention may be better understood by reference to the accompanying drawing which is a graph representing two sets of curves plotting the relationship between antibody concentration and the fractional occupancy of the binding sites at certain prescribed antigen concentrations and the relationship between antibody concentration and the percentage of antigen bound to the binding sites at the same prescribed antigen concentrations. Each curve relates to the antibody concentration [Ab], expressed in terms of 1/K, plotted along the x-axis. For the set of curves which remain constant or decline with increasing [Ab], the y-axis represents the fractional occupancy (F) of binding sites on the antibody by the antigen; for the second set, the y-axis represents the percentage (b%) of antigen bound to those binding sites. The individual curves in each set represent the relationships corresponding to four different antigen concentrations [An] expressed in terms of K, namely 10/K, 1.0/K, 0.1/K and 0.01/K. The curves show that as [Ab] falls F reaches an essentially constant level, the value of which is dependent on [An].

DETAILED DESCRIPTION

The choice of a solid support is a matter to be left to the user. Preferably the support is non-porous so that the binding agent is disposed on its surface, for example as a monolayer. Use of a porous support may cause the binding agent, depending on its molecular size, to be carried down into the pores of the support where its exposure to the analyte whose concentration is to be determined may likewise be affected by the geometry of the pores, so that a false reading may be obtained. Porous supports such as nitrocellulose paper dotted with spots of binding agent are therefore less preferred. Unlike the supports used in GB 2,099,578A, which seem to need to be porous because of the large number of molecules to be attached, the supports for use in the present invention use much smaller quantities and therefore need not be porous. The non-porous supports may, for example be of plastics material or glass, and any convenient rigid plastics material may be used. Polystyrene is a preferred plastics material, although other polyolefins or acrylic or vinyl polymers could likewise be used.

The support means may comprise microbeads, e.g. of such a plastics material, which can be coated with uniform layers of binding agent and retained in specified locations, e.g. hollows, on a support plate. Alternatively the material may be in the form of a sheet or plate which is spotted with an array of dots of binding agent. It can be advantageous for the configuration of the support means to be such that liquid samples of approximately the volume V liters are readily retained in contact with the plurality of spaced apart locations marked with the different binding agents. For example, the spaced apart locations may be arranged in a well in the support means, and a plurality of wells, each provided with the same group of different binding agents in spaced apart locations, can be linked together to form a microtiter plate for use with a plurality of samples.

When the support means is to be used in conjunction with a measuring system involving light scanning, the material, e.g. plastics, for the support is desirably opaque to light, for example it may be filled with an opacifying material which may inter alia be white or black, such as carbon black, when the signals to be [illegible in source] ...cent markers. In general, reflective materials are preferred in this case to enhance light collection in the detecting instrument or photographic plate. The final choice of optimum material is governed by its ability to attach the binding agent to its surface, its absence of background signal emission and its possession of other properties tending to maximise the signal/noise ratio for the particular marker or markers attached to the binding agent situated on its surface. Very satisfactory results have been obtained in the Examples described below by the use of a white opaque polystyrene microtiter plate commercially available from Dynatech under the trade name White Microfluor microtiter wells.

The binding agents used may be binding agents of different specificity, that is to say agents which are specific to different analytes, or two or more of them may be binding agents of the same specificity but of different affinity, that is to say agents which are specific to the same analyte but have different equilibrium constants K for reaction with it. The latter alternative is particularly useful where the concentration of analyte to be assayed in the unknown sample can vary over considerable ranges, for example 2 or 3 orders of magnitude, as in the case of HCG measurement in urine of pregnant women, where it can vary from 0.1 to 100 or more IU/ml.

The binding agents used will preferably be antibodies, more preferably monoclonal antibodies. Monoclonal antibodies to a wide variety of ingredients of biological fluids are commercially available or may be made by known techniques. The antibodies used may display conventional affinity constants, for example from 10^8 or 10^9 liters/mole upwards, e.g. of the order of 10^10 or 10^11 liters/mole, but high affinity antibodies with affinity constants of 10^12-10^13 liters/mole can also be used. The invention can be used with such binding agents which are not themselves labelled. However, it is also possible and frequently desirable to use labelled binding agents so that the system binding agent/analyte/site-recognition reagent includes two different labels of the same type, e.g. fluorescent, chemiluminescent, enzyme or radioisotopic, one on the binding agent and one on the site-recognition reagent. The measuring operation then measures the ratio of the intensity of the two signals and thus eliminates the need to place the same amount of labelled binding agent on the support when measuring signals from standard samples for calibration purposes as when measuring signals from the unknown samples. Because the system depends solely on measurement of a ratio representative of binding site occupancy, there is also no need to measure the signal 
from the entire spot but scanning only a portion is sufficient. Each binding agent is preferably labelled with the same label but different labels can be used.

The binding agents may be applied to the support in any of the ways known or conventionally used for coating binding agents onto supports such as tubes, for example by contacting each spaced apart location on the support with a solution of the binding agent in the form of a small drop, e.g. 0.5 microliter, on a 1 mm^2 spot, and allowing them to remain in contact for a period of time before washing the drops away. A roughly constant small fraction of the binding agent present in the drop becomes adsorbed onto the support as a result of this procedure. It is to be noted that the coating density of binding agent on the microspot does not need to be less than the coating density in conventional antibody-coated tubes; the reduction in the number of molecules on each spot may be achieved solely by reduction of the size of the spot rather than the coating density. A high coating density is generally desirable to maximise signal/noise ratios. The sizes of the spots are advantageously less than 10 mm^2, preferably less than 1 mm^2. The separation is desirably, but not necessarily, 2 or 3 times the radius of the spot, or more. These suggested geometries can nevertheless be changed as required, being subject solely to the limitations on the number of binding agent molecules in each spot, the minimum volume of the sample to which the array of spots will be exposed and the means locally available for conveniently preparing an array of spots in the manner described.

Once the binding agents have been coated onto the support it is conventional practice to wash the support, in the case of antibodies as binding agents, with a solution containing albumen or other protein to saturate all remaining non-specific adsorption sites on the support and elsewhere. To confirm that the amount of binding agent in an individual spot will be less than the maximum amount (0.1 V/K) required to conform to the principle of the present invention, the amount of binding agent present on any individual site can be checked by labelling the binding agent with a detectable marker of known specific activity (i.e. known amount of marker per unit weight of binding agent) and measuring the amount of marker present. Thus, if the use of labelled binder is not desired on the solid support used in the method of the invention the binding agent can nevertheless be labelled in a trial experiment and identical conditions to those found in that trial to give rise to correct loadings of binding agent can be used to apply unlabelled binding agent to the supports to be actually used.

The minimum size of the liquid sample (V liters) is correlated with the number of moles of binding agent (less than 0.1 V/K) so that only an insignificant proportion of the analyte present in the liquid sample becomes bound to the binding agent. This proportion is as a general rule less than 10%, usually less than 5% and desirably 1 or 2% or less, depending on the accuracy desired for the assay (greater accuracy being obtained, other things being equal, when smaller proportions of analyte are bound) and the magnitude of other error-introducing factors present. Sample sizes of the order of one or a few ml or less, e.g. down to 100 microliters or less, are often preferred, but circumstances may arise when larger volumes are more conveniently assayed, and the geometry may be adjusted accordingly. The sample may be used at its natural concentration level or if desired it may be diluted to a known extent.

The site-recognition reagents used in the method according to the invention may themselves be antibodies, e.g. monoclonal antibodies, and may be anti-idiotypic or anti-analyte antibodies, the latter recognising occupied sites. Alternatively, for example for analytes of small molecular size such as thyroxine (T4), unoccupied sites may be recognised using either the analyte itself, appropriately labelled, or the analyte covalently coupled to another molecule, e.g. a protein molecule, which is directly or indirectly labelled. The site-recognition reagents may be labelled directly or indirectly with conventional fluorescent labels such as fluorescein, rhodamine or Texas Red or materials usable in time-resolved pulsed fluorescence such as europium and other lanthanide chelates, in a conventional manner. Other labels such as chemiluminescent, enzyme or radioisotopic labels may be used if appropriate. Each site-recognition reagent is preferably labelled with the same label but different labels can be used in different reagents. The site-recognition reagents may be specific for a single one of the binding agent/analyte spots in each group of spots or in certain circumstances, as with glycoprotein hormones such as HCG and FSH which have a common binding site, they may be cross-reacting reagents able to react with occupied binding sites in more than one of the spots.

In the assay technique the signals representative of the fractional occupancy of the binding agent in the test samples of unknown concentrations of the analytes can be calibrated by reference to dose response curves obtained from standard samples containing known concentrations of the same analytes. Such standard samples need not contain all the analytes together, provided that each of the analytes is present in some of the standard samples. Fractional occupancy may be measured by estimating occupied binding sites (as with an anti-analyte antibody) or unoccupied binding sites (as with an anti-idiotypic antibody), as one is the converse of the other. For greater accuracy it is desirable to measure the fraction which is closer to zero because a change in fractional occupancy of 0.01 is proportionately greater in this case, although for fractional occupancies in the range 25-75% either alternative is generally satisfactory.

In that embodiment of the present invention which relies on two fluorescent markers, the measurement of relative intensity of the signals from the two markers, one on the binding agent and the other on the site-recognition reagent, may be carried out by a laser scanning confocal microscope such as a Bio-Rad Lasersharp MRC 500, available from Bio-Rad Laboratories Ltd., and having a dual channel detection system. This instrument relies on a laser beam to scan the dots or the like on the support to cause fluorescence of the markers and wavelength filters to distinguish and measure the amounts of fluorescence emitted. Time-resolved fluorescence methods may also be used. Interference (so-called crosstalk) between the two channels can be compensated for by standard corrections if it occurs or conventional efforts can be made to reduce it. Discrimination of the two fluorescent signals emitted by the dual-labelled spots is accomplished in the present form of this instrument, by filters capable of distinguishing the characteristic wavelength of the two fluorescent emissions; however, fluorescent substances may be distinguished by other physical characteristics such as differing fluorescence decay times, bleaching times, etc., and any of these means may be used, either alone or 
in combination, to differentiate between two fluorophores and hence permit measurement of the ratio of two fluorescent labelled entities (binding agent and site-recognition reagent) present on an individual spot, using techniques well known in the fluorescence measurement field. When only one fluorescent label is present the same techniques may be used, provided that care is taken to scan the entire spot in each case and the spots contain essentially the same amount of binding agent from one assay to the next when the unknown and standard samples are used.

In the case of other labels, such as radioisotopic labels, chemiluminescent labels or enzyme labels, analogous means of distinguishing the individual signals from one or from each of a pair of such labels are also well known. For example two radioisotopes such as 125I and 131I may be readily distinguished on the basis of the differing energies of their respective radioactive emissions. Likewise it is possible to identify the products of two enzyme reactions, deriving from dual enzyme-labelled antibody couplets, these being e.g. of different colours, or two chemiluminescent reactions, e.g. of different chemiluminescent lifetime or wavelength of light emission, by techniques well known in the respective fields.

The invention may be used for the assaying of analytes present in biological fluids, for example human body fluids such as blood, serum, saliva or urine. They may be used for the assaying of a wide variety of hormones, proteins, enzymes or other analytes which are either present naturally in the liquid sample or may be present artificially such as drugs, poisons or the like.

For example, the invention may be used to provide a device for quantitatively assaying a variety of hormones relating to pregnancy and reproduction, such as FSH, LH, HCG, prolactin and steroid hormones (e.g. progesterone, estradiol, testosterone and androstene-dione), or hormones of the adrenal pituitary axis, such as cortisol, ACTH and aldosterone, or thyroid-related hormones, such as T4, T3, and TSH and their binding protein TBG, or viruses such as hepatitis, AIDS or herpes virus, or bacteria, such as staphylococci, streptococci, pneumococci, gonococci and enterococci, or tumour-related peptides such as AFP or CEA, or drugs such as those banned as illicit improvers of athletes' performance, or food contaminants. In each case the binding agents used will be specific for the analytes to be assayed (as compared with others in the sample) and may be monoclonal antibodies therefor.

Further details on the methodology are to be found in my International Patent Publication WO88/01058, the contents of which are incorporated herein by reference.

The invention is illustrated by the following Examples.

EXAMPLE 1

An anti-TNF (tumour necrosis factor) antibody having an affinity constant for TNF at 25° C. of about 1 x 10^9 liters/mole is labelled with Texas Red. A solution of the antibody at a concentration of 80 micrograms/ml is formed and 0.5 microliter aliquots of this solution are added in the form of droplets one to each well of a Dynatech Microfluor (opaque white) filled polystyrene microtiter plate having 12 wells.

An anti-HCG (human chorionic gonadotropin) antibody having an affinity constant for HCG at 25° C. of about 6 x 10^8 liters/mole is also labelled with Texas Red. A solution of the antibody at a concentration of 80 micrograms/ml is formed and 0.5 microliter aliquots of this solution are added in the form of droplets one to each well of the same Dynatech Microfluor microtiter plate.

After addition of the droplets the plate is left for a few hours in a humid atmosphere to prevent evaporation of the droplets. During this time some of the antibody molecules in the droplets become adsorbed onto the plate. Next, the wells are washed several times with a phosphate buffer and then they are filled with about 400 microliters of a 1% albumen solution and left for several hours to saturate the residual binding sites in the wells. Thereafter they are washed again with phosphate buffer.

The resulting plate has in each of its wells two spots each of area approximately 1 mm^2. Measurement of the amount of fluorescence shows that in each well one spot contains about 5 x 10^9 molecules of anti-TNF antibody and the other contains about 5 x 10^9 molecules of anti-HCG antibody. The wells are designed for use with liquid samples of volume 400 microliters, so that 0.1 V/K is 4 x 10^-14 moles (equivalent to 2.4 x 10^10 molecules) for the anti-TNF antibody and 7 x 10^-14 moles (equivalent to 4 x 10^10 molecules) for the anti-HCG antibody.

EXAMPLE 2

A microtiter plate prepared as described in Example 1 is used in an assay for an artificially produced solution containing TNF and HCG. A test sample of the solution, amounting to about 400 microliters, is added to one of the wells and allowed to incubate for several hours. About 400 microliters of various standard solutions containing known concentrations (0.02, 0.2, 2 and 20 ng/ml) of TNF or HCG are added to other wells of the plate and also allowed to incubate for several hours. The wells are then washed several times with buffer solution.

As site-recognition reagents there are used for the TNF spots an anti-TNF antibody having an affinity constant for TNF at 25° C. of about 1 x 10^10 liters/mole and for the HCG spots an anti-HCG antibody having an affinity constant for HCG at 25° C. of about 1 x 10^11 liters/mole. Both antibodies are labelled with fluorescein (FITC). 400 microliter aliquots of solutions of these labelled antibodies are added to the wells and allowed to stand for a few hours. The wells are then washed with buffer.

The resulting fluorescence ratio of each spot is quantified with a Bio-Rad Lasersharp MRC 500 confocal microscope. From the standard solutions dose response curves for TNF and HCG are built up, the figures for TNF being as follows:

TNF concentration (ng/ml)    FITC fluorescence / Texas Red fluorescence on TNF spot
0.02                         [illegible in source]
[intermediate rows illegible in source]
20                           42.5

and for HCG as follows:

HCG concentration (ng/ml)    FITC fluorescence / Texas Red fluorescence on HCG spot
0.02                         [illegible in source]
0.2                          7.2
-continued

HCG concentration (ng/ml)    FITC fluorescence / Texas Red fluorescence on HCG spot
20                           28.2

The artificially produced solution was found to give ratio readings of 5.9 on the TNF spot and 10.5 on the HCG spot, correlating well with the actual concentrations of TNF (0.5 ng/ml) and HCG (0.5 ng/ml) obtained from the dose response curves.

EXAMPLE 3

Using similar procedures to those outlined in Example 1 a microtiter plate containing spots of labelled anti-T4 (thyroxine) antibody (affinity constant about 1 x 10^11 liters/mole at 25° C.), labelled anti-TSH (thyroid stimulating hormone) antibody (affinity constant about 5 x 10^9 liters/mole at 25° C.) and labelled anti-T3 [line missing in source] wells is produced, the spots containing less than 1 x 10^-12 V moles of anti-T4 antibody or less than 2 x 10^-11 V moles of anti-TSH antibody or less than 1 x 10^-12 V moles of anti-T3 antibody.

The developing antibody (site-recognition reagent) for the TSH assay is an anti-TSH antibody with an affinity constant for TSH of 2 x 10^10 liters/mole at 25° C. This antibody is labelled with fluorescein (FITC). The site-recognition reagents for the T4 and T3 assays are T4 and T3 coupled to poly-lysine and labelled with FITC, and they recognise the unfilled sites on their respective first antibodies.

Using 400 microliter aliquots of standard solutions containing various known amounts of T4, T3 and TSH, dose response curves are obtained by methods analogous to those in Example 2, correlating fluorescence ratios with T4, T3 and TSH concentrations. The plate is used to measure T4, T3 and TSH levels in serum from human patients with good correlation with the results obtained by other methods.

EXAMPLE 4

Using similar procedures to those outlined in Example 1 a microtiter plate containing spots of first labelled anti-HCG antibody (affinity constant about 6 x 10^8 liters/mole at 25° C.), second labelled anti-HCG antibody (affinity constant about 1.3 x 10^11 liters/mole at 25° C.) and labelled anti-FSH (follicle stimulating hormone) antibody (affinity constant about 1.3 x 10^8 liters/mole at 25° C.) in each of the individual wells is produced, the spots each containing less than 0.1 V/K moles of the respective antibody. A cross-reacting (alpha subunit) monoclonal antibody 8D10 with an affinity constant of 1 x 10^11 liters/mole is used as a common developing antibody for both the HCG and the FSH assays.

Using 400 microliter aliquots of standard solutions containing various known concentrations of HCG and FSH, dose response curves are obtained by methods analogous to those in Example 2, correlating fluorescence ratios with HCG and FSH concentrations, the curve obtained with the higher affinity anti-HCG antibody giving more concentration-sensitive results at the lower HCG concentrations whereas the curve from the lower affinity anti-HCG antibody is more concentration-sensitive at the higher HCG concentrations. The plate is used to measure HCG and FSH concentrations in the urine of women in pregnancy testing, giving good correlations with results obtained by other means and achieving effective concentration measurements for HCG over a concentration range of two or three orders of magnitude by correct choice of the best HCG spot and dose response curve.

Production of labelled antibodies

The labelling of the antibodies with fluorescent labels can be carried out by a well known and standard technique, see Leslie Hudson and Frank C. Hay, "Practical Immunology", Blackwell Scientific Publications (1980), pages 11-13, for example as follows:

The monoclonal antibody anti-FSH 3G3, an FSH specific (beta subunit) antibody having an affinity constant (K) of 1.3 x 10^8 liters per mole, was produced in the Middlesex Hospital Medical School, and was labelled with TRITC (rhodamine isothiocyanate) or Texas Red, giving a red fluorescence.

The monoclonal antibody anti-FSH 8D10, a cross-reacting [words missing in source] constant (K) of 1 x 10^11 liters per mole, was likewise produced in the Middlesex Hospital Medical School and was labelled with FITC (fluorescein isothiocyanate), giving a yellow-green fluorescence.

The general procedure used involved ascites fluid purification (ammonium sulphate precipitation and T-gel chromatography) followed by labelling, according to the following steps:

1.a. Ammonium sulphate purification

1. Add 4.1 ml saturated ammonium sulphate solution to 5 ml antibody preparation (culture supernatant or 1:5 diluted ascites fluid) under constant stirring (45% saturation).
2. Continue stirring for 30-90 min. Centrifuge at 2500 rpm for [duration illegible in source].
3. Discard the supernatant and dissolve the precipitate in PBS (final volume 5 ml). Repeat Steps 1 and 2.
4. Add 3.6 ml saturated ammonium sulphate (40% saturation) under constant stirring. Repeat Step 2.
5. Discard the supernatant and dissolve the pellet in the desired buffer.
6. Dialyse overnight in cold against the same buffer (using fresh, boiled-in-d/w dialysis bag).
7. Determine the protein concentration either at A280 or by Lowry estimation.

1.b. T-gel Chromatography: (Buffer: 1M Tris-Cl, pH 7.6. Solid potassium sulphate)

1. Clear 2 ml of ascites fluid by centrifugation at 4000 rpm.
2. Add 1M Tris-Cl solution to achieve final concentration of 0.1M.
3. Add sufficient potassium sulphate. Final concentration = 0.5M.
4. Apply the [illegible in source] to the T-gel column.
5. Wash the column with 0.1M Tris-Cl buffer containing 0.5M potassium sulphate, until protein profile (at A280) returns to zero.
6. Elute absorbed protein using 0.1M Tris-Cl buffer as the eluant.
7. Pool the fractions containing antibody activity and concentrate using Amicon 30 concentrator.
8. If HPHT purification is to be carried out, use HPHT chromatography Starting buffer during Step 7.

2. Labelling of Antibodies FITC/TRITC conjugation:
1. Dialyse the purified Ig protein into 0.25M Carbonate-bicarbonate buffer, pH 9.0 to a concentration of 20 mg/ml.
2. Add FITC/TRITC to achieve a 1:20 ratio with protein (i.e. 0.05 mg for every 1 mg of protein).
3. Mix and incubate at 4° C. for 16-18 hrs.
4. Separate the conjugated protein from unconjugated by:
a. Sephadex G-25 chromatography for FITC label, or
b. DEAE-Sephacel chromatography for TRITC/FITC label

[text illegible in source] 0.005M Phosphate, pH 8.0 and 0.18M Phosphate, pH 8.0 for (b).

(2.87 x O.D.495 nm) / (O.D.280 nm - 0.35 x O.D.495 nm)

I claim:

1. A method for determining the ambient concentrations of a plurality of analytes in a liquid sample of volume V liters, comprising:

loading a plurality of different binding agents, each being capable of reversibly binding an analyte which is or may be present in the liquid sample and is specific for said analyte as compared to the other components of the liquid sample, onto a support means at a plurality of spaced apart small spots such that each spot has a high coating density of one of said binding agents but not more than 0.1 V/K moles of binding agent are present on any spot, where K liters/mole is the affinity constant of said binding agent for said analyte;

contacting the loaded support means with the liquid sample to be analyzed, such that each of the spots is contacted in the same step with said liquid sample, the amount of liquid used in said sample being such that only an insignificant proportion of any analyte present in said liquid sample becomes bound to said binding agent specific for said analyte; and

measuring a parameter representative of the fractional occupancy by said analytes of said binding agents at the spots by a competitive or non-competitive assay technique using a site-recognition reagent for each binding agent capable of recognizing either the unfilled binding sites or the filled binding sites on said binding agent, said site-recognition reagent being labelled with a marker enabling the amount of said reagent in the particular location to be measured.

2. A method as claimed in claim 1, wherein each of said spots has a size of less than 1 mm^2.

3. A method as claimed in claim 2, wherein each of said spots contains more than 10^4 molecules of binding agent.

4. A method as claimed in claim 3, wherein each of said spots has less than 0.01 V/K moles of binding agent.

5. A method as claimed in claim 3, wherein said binding agents used have affinity constants for said analytes of from 10^8 to 10^13 liters per mole.

6. A method as claimed in claim 3, wherein said binding agents used have affinity constants for said analytes of the order of 10^10 or 10^11 liters per mole.

7. A method as claimed in claim 3, wherein the volume of said liquid sample is not more than 0.1 liter.

8. A method as claimed in claim 3, wherein the volume of said liquid sample is 400 to 1000 microliters.

9. A method as claimed in claim 1, wherein said binding agents loaded onto said support means are antibodies for the analytes whose concentrations are to be determined.

10. A method as claimed in claim 1, wherein said binding agents are labelled with markers enabling the concentration levels of said binding agents to be measured.

11. A method as claimed in claim 10, wherein said binding agents and said site-recognition reagents are labelled with fluorescent markers such that at the individual spots the assay technique for measuring fractional occupancy of the binding agents measures the ratios of the signals emitted by the fluorescent markers.

12. A device for use in determining the ambient concentrations of a plurality of analytes in a liquid sample of volume V liters, comprising a solid support means having located thereon at high coating density at a plurality of spaced apart small spots a plurality of different binding agents, each binding agent being capable of reversibly binding an analyte which is or may be present in said liquid sample and is specific for said analyte as compared to the other components of said liquid sample, each spot having not more than 0.1 V/K moles of a single binding agent, where K liters/mole is the affinity constant of said single binding agent for reaction with the analyte to which it is specific.

13. A device as claimed in claim 12, wherein each of said spots has a size of less than 1 mm^2.

14. A device as claimed in claim 13, wherein each of said spots contains more than 10^4 molecules of binding agent.

15. A kit for use in determining the ambient concentration of a plurality of analytes in a liquid sample of volume V liters, comprising:

a solid support means having located thereon at high coating density at a plurality of spaced apart small spots a plurality of different binding agents, each binding agent being capable of reversibly binding an analyte which is or may be present in said liquid sample and is specific for said analyte as compared to the other components of said liquid sample, each spot having not more than 0.1 V/K moles of a single binding agent, where K liters/mole is the affinity constant of said single binding agent for reaction with the analyte to which it is specific;

a plurality of standard samples containing known concentrations of the analytes whose concentrations in the liquid sample are to be measured; and

a set of labelled site-recognition reagents for reaction with filled or unfilled binding sites on said binding agents.

16. A kit as claimed in claim 15, wherein each of said spots has a size of less than 1 mm^2.

17. A kit as claimed in claim 16, wherein each of said spots contains more than 10^4 molecules of binding agent.

* * * * *
ABSTRACT 



The present invention provides methods for determining the 
concentration of analytes in liquid samples in which the 
amount of binding agent having binding sites specific for a 
given analyte in the liquid sample is immobilized in a test 
zone on a solid support, the binding agent being divided into 
an array of spatially separated locations in the test zone. The 
concentration of the analyte is obtained by back-titrating the 
occupied binding agent with a developing agent having a 
marker and integrating the signal from each location in the 
array. The present invention also provides a method for 
determining a value representative of a fraction of binding 
sites of the binding agent which are occupied by the analyte, 
comprising immobilizing the specific binding agent on a 
solid support, wherein the specific binding agent used for the 
fractional occupancy is present in an amount less than 0.1 
V/K moles, where V is the volume of the liquid sample and
K is the association constant for the analyte specifically 
binding to the binding agent, and wherein the specific 
binding agent is divided into an array of spatially separated 
locations; contacting the support with the liquid sample; 
contacting the support with the developing agent; separating 
non-specifically bound developing agent and measuring the
signal at each of the locations to obtain a value which 
represents the fraction of the binding sites occupied by the 
analyte at each location; and adding the measured values to 
provide a total signal which indicates the fraction of the 
binding sites of the binding agent occupied by the analyte. 
Test kits and devices used in practicing these methods are 
also disclosed. 

21 Claims, 3 Drawing Sheets 
BINDING ASSAY

This application is the U.S. national stage of
PCT/GB94/02814, filed Dec. 23, 1994.

FIELD OF THE INVENTION 

The present invention relates to binding assays, e.g. for 
determining the concentration of analytes in liquid samples. 

BACKGROUND TO THE INVENTION 

It is known to measure the concentration of an analyte,
such as a drug or hormone, in a liquid sample by contacting 
the liquid with a binding agent having binding sites specific 
for the analyte, separating the binding agent having analyte 
bound to it and measuring a value representative of the 
proportion of the binding sites on the binding agent that are 
occupied by analyte (referred to as the fractional 
occupancy). Typically, the concentration of the analyte in the 
liquid sample can then be determined by comparing the 
fractional occupancy against values obtained from a series 
of standard solutions containing known concentrations of 
analyte. 

In the past, the measurement of fractional occupancy has 
usually been carried out by back-titration with a labelled
developing reagent using either so-called competitive or 
non-competitive methods. 

In the competitive method, the binding agent having 
analyte bound to it is back-titrated, either simultaneously or 
sequentially, with a labelled developing agent, which is 
typically a labelled version of the analyte. The developing 
agent can be said to compete for the binding sites on the 
binding agent with the analyte whose concentration is being 
measured. The fraction of the binding sites which become 
occupied with the labelled analyte can then be related to the 
concentration of the analyte in the liquid sample as 
described above. 

In the non-competitive method, the binding agent having
analyte bound to it is back-titrated with a labelled
developing agent capable of binding to either the bound analyte or
the occupied binding sites on the binding agent. The
fractional occupancy of the binding sites can then be measured
by detecting the presence of the labelled developing agent
and, just as with competitive assays, related to the
concentration of the analyte in the liquid sample as described
above.

In both competitive and non-competitive methods, the 
developing agent is labelled with a marker. A variety of 
markers have been used in the past, for example radioactive 
isotopes, enzymes, chemiluminescent markers and fluores- 
cent markers. 

In the field of immunoassay, competitive immunoassays 
have in general been carried out in accordance with design 
principles enunciated by Berson and Yalow, for instance in 
"Methods in Investigative and Diagnostic Endocrinology" 
(1973), pages 111 to 116. Berson and Yalow proposed that
in the performance of competitive immunoassays, maximum 
sensitivity is achieved if an amount of binding agent is used 
to bind approximately 30 to 50% of a low concentration of 
the analyte to be detected. In non-competitive 
immunoassays, maximum sensitivity is generally thought to 
be achieved by using sufficient binding agent to bind close 
to 100% of the analyte in the liquid sample. However, in
both cases immunoassays designed in accordance with these 
widely accepted precepts require the volume of the sample 
to be known and the amount of binding agent used to be 
accurately known or known to be constant. 
In International Patent Application WO84/01031, I
disclosed that the concentration of an analyte in a liquid sample
can be measured by contacting the liquid sample with a
small amount of binding agent having binding sites specific
for the analyte. In this method, provided the amount of
binding agent is small enough to have only an insignificant
effect on the concentration of the analyte in the liquid
sample, it is found that the fractional occupancy of the
binding sites on the binding agent by the analyte is
effectively independent of the volume of the sample.

This approach is further refined in EP304,202 which
discloses that the sensitivity and ease of development of the
assays in WO84/01031 is improved by using an amount of
binding agent less than 0.1 V/K moles located on a small area
(or "microspot") of a solid support, where V is the volume
of the sample and K is the equilibrium constant of the
binding agent for the analyte.
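
The 0.1 V/K criterion can be illustrated numerically. The following is a minimal sketch and not part of the original disclosure: it assumes simple one-to-one mass-action binding at equilibrium, and the function names and the example values of K, V and analyte concentration are hypothetical.

```python
import math


def max_binding_agent_moles(volume_l, k_l_per_mol, factor=0.1):
    """Upper bound on binding agent (moles) under the 'less than 0.1 V/K' criterion."""
    return factor * volume_l / k_l_per_mol


def fractional_occupancy(binder_mol, analyte_conc_m, volume_l, k_l_per_mol):
    """Equilibrium fraction of binding sites occupied (assumed 1:1 mass action).

    With x moles bound, K = x*V / ((b - x) * (a - x)), which rearranges to
    K*x**2 - (K*(a + b) + V)*x + K*a*b = 0; the smaller root is the physical one.
    """
    a = analyte_conc_m * volume_l  # total analyte in the sample, moles
    b = binder_mol                 # total binding sites, moles
    k, v = k_l_per_mol, volume_l
    lin = k * (a + b) + v
    disc = lin * lin - 4.0 * k * k * a * b
    # Numerically stable form of the smaller quadratic root.
    x = 2.0 * k * a * b / (lin + math.sqrt(disc))
    return x / b


if __name__ == "__main__":
    K = 1e10   # L/mol, hypothetical association constant
    C = 1e-10  # mol/L, hypothetical analyte concentration (K*C = 1)
    for volume in (1e-3, 1e-2):
        tiny = fractional_occupancy(1e-16, C, volume, K)
        print(f"V = {volume} L, binder = 1e-16 mol: occupancy = {tiny:.4f}")
    # Far above the 0.1 V/K limit the binder depletes the analyte.
    print(fractional_occupancy(1e-12, C, 1e-3, K))
```

Under these assumptions, an amount of binding agent well below 0.1 V/K gives nearly the same occupancy for both sample volumes, whereas a much larger amount depletes the analyte and lowers the occupancy.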

In WO93/08472, I disclosed a method of further
improving the sensitivity of binding assays by immobilising small
amounts of binding agent at high density on a support in the
form of a microspot. In this assay, a developing agent
comprising a microsphere containing a marker, e.g. a
fluorescent dye, is used to back-titrate the binding agent after it
has been contacted with the liquid sample containing the
analyte. As the microsphere can contain a large number of
molecules of fluorescent dye, the sensitivity of the assay is
improved as the signal from small amounts of analyte can be
amplified. This amplification permits sensitive assays to be
carried out even with microspots having an area of 1 mm²
or less and a surface density of binding agent in the range of
1000 to 100000 molecules/µm².
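
To give a feel for the quantities involved, the stated area and surface-density range can be converted into a number of molecules and moles of binding agent per microspot. This is a minimal arithmetic sketch; the function names are hypothetical and only the area and density figures come from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole
UM2_PER_MM2 = 1e6         # square micrometres in one square millimetre


def molecules_in_spot(area_mm2, density_per_um2):
    """Number of binding-agent molecules on a spot of the given area and density."""
    return area_mm2 * UM2_PER_MM2 * density_per_um2


def moles_in_spot(area_mm2, density_per_um2):
    """Same quantity expressed in moles."""
    return molecules_in_spot(area_mm2, density_per_um2) / AVOGADRO


if __name__ == "__main__":
    for density in (1000, 100000):
        n = molecules_in_spot(1.0, density)
        print(f"{density} molecules/um^2 on 1 mm^2: {n:.3g} molecules, "
              f"{moles_in_spot(1.0, density):.3g} mol")
```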

SUMMARY OF THE INVENTION 

The present invention provides a method, device and test
kit for carrying out a binding assay in which binding agent
having binding sites specific for a given analyte in a liquid
sample is immobilised in a test zone on a solid support, the
binding agent being divided into an array of spatially
separated locations in the test zone, wherein the
concentration of the analyte is obtained by integrating the signal from
the locations in the array.

Accordingly, in one aspect, the present invention provides
a method for determining the concentration of an analyte in
a liquid sample comprising:

(a) locating binding agent having binding sites specific for
the analyte in a test zone on a solid support, the binding
agent being divided into an array of spatially separated
locations;

(b) contacting the support with the liquid sample so that
a fraction of the binding sites at each location become
occupied by analyte;

(c) measuring a value of a signal representative of the
fraction of the binding sites occupied by the analyte for
each individual location in the array;

(d) integrating the signal value obtained for each location
in the array to provide an integrated signal; and,

(e) comparing the integrated signal to corresponding
values, obtained from a series of standard solutions
containing known concentrations of analyte, to
determine the concentration of the analyte in the liquid
sample.
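
Steps (d) and (e) can be sketched in code. The example below is illustrative only: it assumes the integrated signal is a plain sum of the per-location values and that the standard curve is read by piecewise-linear interpolation, neither of which is prescribed by the text. The function names and the numbers are hypothetical.

```python
def integrate_signal(location_signals):
    """Step (d): combine the per-location signal values into one integrated signal."""
    return sum(location_signals)


def concentration_from_standards(integrated_signal, standards):
    """Step (e): read a concentration off a standard curve.

    `standards` is a list of (known_concentration, integrated_signal) pairs.
    Assumes the signal rises monotonically with concentration and uses
    piecewise-linear interpolation between neighbouring standards.
    """
    if len(standards) < 2:
        raise ValueError("at least two standards are needed")
    pts = sorted(standards, key=lambda p: p[1])
    if not pts[0][1] <= integrated_signal <= pts[-1][1]:
        raise ValueError("signal outside the range covered by the standards")
    for (c_lo, s_lo), (c_hi, s_hi) in zip(pts, pts[1:]):
        if s_lo <= integrated_signal <= s_hi:
            if s_hi == s_lo:
                return c_lo
            frac = (integrated_signal - s_lo) / (s_hi - s_lo)
            return c_lo + frac * (c_hi - c_lo)


if __name__ == "__main__":
    # Hypothetical standards: (concentration, integrated signal).
    curve = [(0.0, 10.0), (1.0, 110.0), (2.0, 190.0)]
    total = integrate_signal([15.0, 15.0, 15.0, 15.0])
    print(total, concentration_from_standards(total, curve))
```

Any monotonic curve-fitting scheme could replace the linear interpolation; it is used here only to keep the sketch short.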

Thus, in the present invention, the values of the signal
from an array of locations in the test zone are used to
determine the concentration of a single analyte. This is in
contrast to the approach described in EP304,202, in which
the signal produced at a single location is used to determine
the concentration of an analyte.

The array of locations of binding agent in the test zone can
be viewed sequentially, e.g. using a confocal microscope,
and the signal value from each location integrated to provide
the integrated signal. Alternatively, the array of locations of
binding agent in the test zone can be viewed together, e.g.
using a charge coupled device (CCD) camera, with the
signal values from each location being measured
simultaneously.

Preferably, the signals representative of the fraction of the
binding sites occupied by binding agent at each location are
measured by back-titrating the binding agent with a
developing agent having a marker, the developing agent being
capable of binding to unoccupied binding sites, bound
analyte or to occupied binding sites in a competitive or
non-competitive method, as described above.

The marker on the developing agent can be a radioactive
isotope, an enzyme, a chemiluminescent marker or a
fluorescent marker. The use of fluorescent dye markers is
especially preferred as the fluorescent dyes can be selected
to provide fluorescence of an appropriate colour range
(excitation and emission wavelength) for detection.
Fluorescent dyes include coumarin, fluorescein, rhodamine and
Texas Red. Fluorescent dye molecules having prolonged
fluorescent periods can be used, thereby allowing
time-resolved fluorescence to be used to measure the strength of
the fluorescent signal after background fluorescence has
decayed. Advantageously, marker can be incorporated
within or on the surface of latex microspheres attached to the
developing agent. This allows a large quantity of marker to
be associated with each molecule of developing agent,
amplifying the signal produced by the developing agent.

Preferably, the locations are microspots and the assay is
carried out using 4-40 (or more) microspots for each
individual analyte, each microspot having an area less than
10000 µm², the microspots being separated from each other
by a distance of 100-1000 µm. The locations within the
array are referred to as "mini-microspots" in the relevant
parts of the description that follow.

The present invention also allows the concentration of a
plurality of analytes to be determined simultaneously by
providing a plurality of test zones, each test zone having
immobilised in it a total amount of binding agent having
binding sites specific for a given analyte in a liquid sample,
the binding agent being divided into an array of spatially
separated locations in the test zone.

Preferably, in accordance with EP304,202, the total
amount of binding agent in each array that is specific for a
given analyte is less than 0.1 V/K moles, where V is the
volume of the sample applied to the test zone and K is the
association constant for analyte binding to the binding agent.
This ensures that the "ambient analyte" conditions described
in WO84/01031 are fulfilled regardless of the analyte
concentration.

One way of immobilising binding agent on a support at a
discrete location such as a microspot is to use technology
comparable to the techniques used in ink-jet or laser printers,
in the case of microspots typically providing spots having
diameter of about 80 µm. Alternatively, if larger locations
are required, a micropipette can be used to control the
amount of binding agent immobilised at a location on a
support.

The present invention is based on the observation that as
the area of a microspot is reduced from a high value, such
as 5 mm² towards zero, the sensitivity of the binding assay
(represented by the lower limit of detection) reaches a
maximum when the microspot reaches a small, but finite
area, typically around 0.1 mm². Further reducing the area of
microspot leads to a reduction in the sensitivity of the
binding assay. However, sensitivity is enhanced by
subdividing a microspot of any given area into multiple
mini-microspots, such that the total coated area occupied by the
mini-microspots, and hence the total amount of binding
agent remains the same.

In addition, the use of an array of microspots for each
individual analyte allows the user to determine whether the
value obtained for any given microspot is in error by
comparison with other microspots in the array.

In a further aspect, the present invention provides a device
for determining the concentration of one or more analytes in
a liquid sample, the device comprising a solid support
having one or more test zones, each test zone having
immobilised in it an amount of binding agent having binding
sites specific for a given analyte in a liquid sample, the
binding agent being divided into an array of spatially
separated locations in the test zone, wherein the
concentration of a given analyte is obtained by integrating signal
values from each location in the array.

In a further aspect, the present invention provides a kit for
determining the concentration of one or more analytes in a
liquid sample, the kit comprising:

(a) a device comprising a solid support having one or
more test zones, each test zone having immobilised in
it an amount of binding agent having binding sites
specific for a given analyte in a liquid sample, the
binding agent being divided into an array of spatially
separated locations in the test zone; and

(b) one or more developing agents for determining the
fraction of the binding sites of the binding agent
occupied by a given analyte, the developing agents
having markers, the developing agents being capable of
binding to bound analyte, or unoccupied or occupied
binding sites of the binding agent;

wherein the concentration of a given analyte is obtained by
integrating signal values from the markers of the developing
agent at each location in the array.

BRIEF DESCRIPTION OF THE DRAWINGS

The unexpected observation of a microspot area yielding
a maximum sensitivity is thought to arise because a number
of opposing effects combine to produce this outcome. These
effects are explained with reference to the accompanying
figures in which:

FIG. 1 represents how the sensitivity of a binding assay
typically changes with area of microspot at constant binding
density;

FIG. 2 represents the typical variation in signal-to-noise
ratio as the area of a microspot changes;

FIG. 3 represents how diffusion constraints on analyte
binding to the binding agent change as the area of the
microspot changes;

FIG. 4 shows how the error in signal measurement
changes as the area of the microspot changes;

FIG. 5 shows a comparison between the microspot array
of the present invention and a single microspot of the prior
art; and

FIG. 6 shows binding agent immobilised as an array of
lines in an alternative embodiment of the invention.

DETAILED DESCRIPTION

Note that in FIGS. 1 to 4, the exact shape of the curves
shown will depend on a number of parameters, including the
physico-chemical properties (ie association and dissociation
rate constants) of the binding agent, the viscosity of the
analyte containing solution to which the microspot is
exposed, the specific activity of the label used, etc.

In all the figures, value A denotes the area of a microspot
typically used in the prior art (typically 1 mm²). In all the
figures, the density of binding agent is kept constant.

FIG. 1 shows the experimentally observed variation of
sensitivity as the area of a microspot is reduced. In the
present context, sensitivity can be defined as the lower limit
of detection which is given by the error (s.d) with which it
is possible to measure zero signal. As FIG. 1 shows, as the
area is reduced from value A, the sensitivity of the binding
assay reaches a maximum and then declines as the area of
the microspot is further reduced towards zero.

Some of the opposing factors leading to this observation
are depicted in FIGS. 2 to 4.

FIG. 2 shows how the signal-to-noise ratio associated
with the measurement of the occupancy of the binding sites
of the binding agent changes as the size of the microspot
decreases towards zero, assuming equilibrium has been
reached. As microspot area is reduced from value A, the
fractional occupancy of the binding sites of the binding
agent reaches a plateau value as the concentration of binding
agent falls below 0.01/K. Therefore, the signal per unit area
from markers on developing agent used to measure the
occupancy of the binding sites by analyte will also reach a
plateau. As the background noise per unit area remains
approximately constant, so the signal-to-noise ratio will
likewise increase to a plateau value as the concentration of
binding agent falls below 0.01/K.

FIG. 3 shows how diffusion constraints change as the area
of a microspot is reduced. "Diffusion constraints" restrict the
rate at which analyte migrates towards and binds to the
binding agent. As FIG. 3 shows, the diffusion constraints
decrease as microspot size decreases, ie the kinetics of the
binding process are faster for smaller microspots, implying
that thermodynamic equilibrium in the system is reached
more rapidly.

On a molecular level, this phenomenon can be pictured as
follows. When a microspot containing binding agent is
placed in a liquid sample containing analyte, the binding
agent binds analyte, depleting the local concentration of the
analyte as compared to the liquid sample as a whole. This
leads to a concentration gradient being established in the
vicinity of the microspot until thermodynamic equilibrium is
reached. This process is found to be slower for larger
microspots, the diffusion constraint being approximately
proportional to microspot radius. When the occupancy of the
binding sites on the binding agent has reached an
equilibrium value, the concentration of analyte in the liquid sample
is uniform. However, equilibrium is reached more rapidly in
the case of microspots of smaller size, implying that, for any
incubation time less than that required to reach equilibrium
in the case of the larger spot, the fractional occupancy of the
binding sites on the smaller spot is greater.

However, as microspot area decreases, so the amount of
binding agent and the level of signal from developing agent
will likewise decrease. This leads to an increase in the
statistical errors in the measurement of the signal from a
marker on a developing agent, which tend to infinity as the
microspot area tends to zero (see FIG. 4).

It can be seen that a consideration of the signal-to-noise
ratio and diffusion constraints indicate an increase in the
sensitivity of a binding assay as the area of a microspot is
decreased. However, these factors are opposed by an
increase in the statistical error of signal measurement as the
microspot area decreases. These factors combine to produce
the observed variation of sensitivity with microspot area
shown in FIG. 1. Thus, the overall consequence is that, as
microspot area falls to zero, the binding assay becomes
totally insensitive.

However, it is desirable to develop sensitive miniaturised
binding assays using microspots of the smallest possible size
containing vanishingly small amounts of binding agent, that
have rapid kinetics to minimise the time taken to carry out
the assay.

The present invention improves sensitivity and reduces
binding assay incubation times by exploiting the
contradictory effects discussed above to maximal advantage. This is
done by sub-dividing the total amount of binding agent into
an array of spatially separated locations such as
"mini-microspots", to reduce diffusion constraints, and integrating
the signals representative of the fractional occupancy of
binding agent at each location to obtain a total signal greater
than would have been achieved by using a single microspot
equal in area to the total area occupied by the
mini-microspots comprising the mini-microspot array.

This implies, inter alia, that the total amount of binding
agent used can be made even smaller than in the prior art
where a balance between kinetics and signal-to-noise
relative to statistical errors had to be made to optimise
sensitivity. The present invention therefore can improve the
signal-to-noise ratio associated with measuring the analyte
bound to binding agent, whilst reducing the diffusion
constraints associated with each microspot in the array.

Moreover, the increasing statistical errors observed in the
prior art as microspot size is reduced are obviated, as the
signal generated from the occupied binding sites by analyte
in the individual microspots is integrated over the array to
provide an integrated signal, thereby retaining the signal
measurement advantage observed for larger microspots.

FIG. 5 illustrates how a single microspot of the prior art
can be divided into an array of 25 microspots containing an
equivalent total amount of binding agent.

Nevertheless, other arrangements or geometries of
binding agent providing assays yielding the same benefits can be
envisaged, see for instance FIG. 6 which shows binding
agent immobilised as lines forming a grid (see the shaded
areas). This configuration likewise has the effect of reducing
the diffusion constraints whilst maintaining the total area
coated with binding agent (e.g. an antibody) to obviate the
increasing statistical errors and associated loss of sensitivity
observed as the amount of binding agent is reduced.

The amount and distribution of the binding agent in the
locations comprising the array depends on a variety of
factors including the diffusion characteristics of the analyte,
the nature and viscosity of the liquid sample containing the
analyte and the protocol used during incubation. However,
given the guidance here the skilled person can readily
determine, either experimentally or by computer modelling,
the optimal arrangement or geometry of array for any given
binding assay.

EXAMPLE

Conjugation of Anti-TSH (Anti-Thyroid
Stimulating Hormone) Mouse Monoclonal
Antibody to Fluorescent Hydrophilic Latex
Microspheres

1. 10 mg of fluorescent hydrophilic latex microspheres in
[...] ml double distilled water were added to 0.5 ml of 1%
TWEEN 20, surface-active agent, shaken for 15 min at room
temperature and centrifuged at 8° C. for 10 min at 20,000
rpm in a MSE High-Spin 20 ultracentrifuge.

2. The pellet was dispersed in 2 ml of 0.05M MES
(2-[N-Morpholino] ethanesulfonic acid) buffer, pH6.1 and
centrifuged.
                     Total Fluorescent Signal (arbitrary units)
Sample incubation
times (mins)         Microspot        Mini-microspot

30                   85 ± 7           111 ± 13
60                   118 ± 15         149 ± 16
120                  141 ± 21         178 ± 16
Overnight            185 ± 20         191 ± 23


3. Step 2 was repeated. 

4. The pellet was dispersed in 0.8 ml MES buffer. 

5. 2 mg of anti-TSH monoclonal developing antibody in
100 µl were added to the microspheres and shaken for 15
min at room temperature.

6. 100 µl of 0.25% ethyl-3 (3-dimethyl amino) propyl
carbodiimide hydrochloride were added to the mixture and
shaken for 2 hours at room temperature.

7. 10 mg glycine in 100 µl of MES buffer were added to
the mixture, shaken for a further 30 min and centrifuged.

8. The pellet was dispersed in 2 ml of 1% BSA (Bovine 
Serum Albumin), shaken for 1 hour at room temperature and 
centrifuged. 

9. The pellet was dispersed in 2 ml of 1% BSA, shaken for 
1 hour at room temperature and centrifuged. 

10. The pellet was dispersed in 2 ml of 0.1M phosphate 
buffer, pH7.4 and centrifuged. 

11. Step 10 was repeated twice. 

12. The pellet was dispersed in 2 ml of 1% BSA containing
0.1% sodium azide and stored at 4° C.

Comparison of Kinetics of Micro Versus Mini-micro
Capture Antibody Microspots in a Sandwich
TSH (Thyroid Stimulating Hormone) Assay

1. Anti-TSH capture antibody microspots (diameter 1.1
mm, area ≈ 10⁶ µm²) were made by depositing 0.5 µl of 200
µg/ml antibody solution on each of 16 Dynatech black
MicroFluor Microtitre wells. The droplets were aspirated
immediately, the wells blocked with SuperBlock from
Pierce for 30 min at room temperature and washed with
0.1M phosphate buffer, pH7.4.

2. The mini-microspots (diameter 0.16 mm) were made
using a piezoelectric ink-jet print-head with an antibody
solution concentration of 1 mg/ml and droplets of
approximately 100 pl (picoliters) for an array of 49 (7x7)
mini-microspots per microtitre well (total coated antibody area =
10⁶ µm²) for 16 wells. The wells were blocked with
SuperBlock and washed with phosphate buffer as above. The
coated antibody density for both micro and mini-microspots
is estimated to be 2×10⁴ IgG/µm².

3. 200 µl of plasma containing 1 µU/ml of TSH was added
to all the microtitre wells and shaken at room temperature.
At 30, 60, 120 min and 18 hours (overnight), four wells
containing the microspots and mini-microspots were washed
with phosphate buffer containing 0.1% TWEEN 20, then
incubated with 200 µl of anti-TSH developing antibody
conjugated to hydrophilic latex microspheres in Tris-HCl
buffer (50 µg/ml) for 30 min at room temperature and
washed with phosphate-TWEEN 20 buffer. The wells were
then scanned with a laser scanning confocal microscope
equipped with an Argon/Krypton laser.
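
The stated coated areas can be checked against the stated spot diameters. The sketch below is only an arithmetic illustration: it assumes the spots are perfect circles, and the function name is hypothetical. The diameters, spot count and antibody density are taken from the protocol above.

```python
import math


def circle_area_um2(diameter_mm):
    """Area in square micrometres of a circular spot of the given diameter."""
    radius_um = diameter_mm * 1000.0 / 2.0
    return math.pi * radius_um ** 2


# Single microspot, diameter 1.1 mm.
microspot_area = circle_area_um2(1.1)

# Array of 49 (7x7) mini-microspots, each of diameter 0.16 mm.
mini_array_area = 49 * circle_area_um2(0.16)

# Antibody molecules per well at the estimated density of 2e4 IgG per
# square micrometre, using the nominal coated area of 1e6 square micrometres.
igg_per_well = 2e4 * 1e6

if __name__ == "__main__":
    print(f"microspot area:        {microspot_area:.3g} um^2")
    print(f"mini-microspot array:  {mini_array_area:.3g} um^2")
    print(f"IgG molecules per well: {igg_per_well:.3g}")
```

Under the circular-spot assumption both layouts come out within a few percent of the nominal 10⁶ µm², consistent with the statement that the total coated area is the same in the two formats.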

Results 



Conclusion 

Significantly higher mean responses were observed
between 30 and 120 mins in the mini-microspot samples,
while the overnight controls did not show significant
differences. This demonstrates that the mini-microspots have
faster kinetics for the association of analyte with the capture
antibody, and could be used to reduce incubation times.
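
The size of the kinetic advantage can be read from the reported mean signals by taking the ratio of mini-microspot to microspot signal at each incubation time. This is a minimal sketch: the dictionary and function names are hypothetical, the values are the reported means and their ± figures, and no statistical test is implied because the text does not state how the ± figures were derived.

```python
# (mean, plus-or-minus value) in arbitrary units, as reported in the results.
RESULTS = {
    "30 min": {"microspot": (85, 7), "mini": (111, 13)},
    "60 min": {"microspot": (118, 15), "mini": (149, 16)},
    "120 min": {"microspot": (141, 21), "mini": (178, 16)},
    "overnight": {"microspot": (185, 20), "mini": (191, 23)},
}


def signal_ratio(row):
    """Ratio of mean mini-microspot signal to mean microspot signal."""
    return row["mini"][0] / row["microspot"][0]


if __name__ == "__main__":
    for label, row in RESULTS.items():
        print(f"{label}: mini/micro = {signal_ratio(row):.2f}")
```

The ratio is well above one at the three shorter incubation times and close to one overnight, which matches the conclusion drawn above.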
The invention claimed is: 

1. A method for determining the concentration of at least 
one analyte in a liquid sample, said method comprising, for 
each analyte, the steps of: 

(a) immobilizing a specific binding agent including bind- 
ing sites specific for the analyte on a solid support, 
wherein the specific binding agent used to determine 
the concentration of the analyte is present in an amount 
less than 0.1 V/K moles, where V is the volume of the 
liquid sample and K is the association constant for the 
analyte specifically binding to the specific binding 
agent, and wherein said specific binding agent is 
divided into an array of spatially separated locations; 

(b) contacting the support with the sample so that a 
fraction of the binding sites of the specific binding 
agent specific for the analyte specifically binds the 
analyte; 

(c) contacting the support with a developing agent 
labelled with a signal-producing marker such that the 
labelled developing agent binds to unoccupied binding 
sites, to specifically bound analyte or to the binding 
sites with specifically bound analyte; 

(d) separating non-specifically bound developing agent 
from the solid support and measuring the signal pro- 
duced by the marker at each of the locations in the array 
to obtain a value which represents the fraction of the 
binding sites occupied by the analyte at each location; 

(e) adding the values obtained at the locations in the array 
to provide a total signal; and 

(f) comparing the total signal to corresponding values 
obtained from a series of standard solutions containing 
known concentrations of the analyte, to determine the 
concentration of the analyte in the liquid sample. 

2. The method according to claim 1, wherein the specific 
binding agent is divided into between 4 and 40 locations. 

3. The method according to claim 1, wherein the locations
have an area of about 10000 µm², the locations being
separated from each other by a distance of 100 to 1000 µm.

4. The method according to claim 1, wherein the concen- 
tration of a plurality of different analytes in the liquid sample 
are determined using a plurality of arrays on said support.

5. The method according to claim 1, wherein (i) the 
specific binding agent is an antibody and the analyte is an 
antigen or (ii) the specific binding agent is an oligonucle- 
otide and the analyte is a nucleic acid. 

6. A method for determining the concentration of at least 
one analyte in a liquid sample, said method comprising, for 
each analyte, the steps of: 

(a) immobilizing a specific binding agent including bind- 
ing sites specific for the analyte on a solid support, 
wherein the specific binding agent used to determine 
the concentration of the analyte is present in an amount 
less than 0.1 V/K moles, where V is the volume of the 
liquid sample and K is the association constant for the 
analyte specifically binding to the specific binding 
agent, and wherein said specific binding agent is 
divided into an array of spatially separated locations; 

(b) contacting the support with the liquid sample so that 
a fraction of the binding sites of the binding agent 
specific for the analyte specifically bind the analyte; 

(c) contacting the support with a developing agent 
labelled with a signal-producing marker such that the 
labelled developing agent binds to unoccupied binding
sites, to specifically bound analyte or to the binding 
sites with specifically bound analyte; 

(d) separating non-specifically bound developing agent 
from the solid support and measuring the signal pro- 
duced by the marker at each of the locations in the array 
to obtain a value which represents the fraction of the 
binding sites occupied by the analyte at each location; 
and 

(e) adding the values obtained at the locations in the array 
to provide a total signal which indicates the concen- 
tration of the analyte in the liquid sample. 

7. The method according to claim 6, wherein the specific
binding agent is divided into between 4 and 40 locations. 

8. The method according to claim 6, wherein the locations
are in an area of about 10000 µm², the locations being
separated from each other by a distance of 100 to 1000 µm.

9. The method according to claim 6, wherein the concen- 
trations of a plurality of different analytes in the liquid 
sample are determined using a plurality of arrays on said 
support. 

10. The method according to claim 6, wherein (i) the
specific binding agent is an antibody and the analyte is an 
antigen or (ii) the specific binding agent is an oligonucle- 
otide and the analyte is a nucleic acid. 

11. A method for determining the concentration of at least
one analyte in a liquid sample, said method employing a 
solid support on which is immobilized, for each analyte, a 
specific binding agent including binding sites specific for the 
analyte, wherein the specific binding agent used to deter- 
mine the concentration of the analyte is present in an amount 
less than 0.1 V/K moles, where V is the volume of the liquid 
sample and K is the association constant for the analyte 
specifically binding to the specific binding agent, and 
wherein said specific binding agent is divided into an array 
of spatially separated locations, said method comprising the 
steps of: 

(a) contacting the support with the sample so that a 
fraction of the binding sites of the specific binding 
agent specific for the analyte specifically binds the 
analyte; 

(b) contacting the support with a developing agent 
labelled with a signal-producing marker such that the 
labelled developing agent binds to unoccupied binding 
sites, to specifically bound analyte or to the binding 
sites with specifically bound analyte; 

(c) separating non-specifically bound developing agent 
from the solid support and measuring the signal pro- 
duced by the marker at each of the locations in the array 
to obtain a value which represents the fraction of the 
binding sites occupied by the analyte at each location; 

(d) adding the values obtained at the locations in the array 
to provide a total signal; and 

(e) comparing the total signal to corresponding values 
obtained from a series of standard solutions containing 
known concentrations of the analyte, to determine the 
concentration of the analyte in the liquid sample. 

12. The method according to claim 11, wherein the 
specific binding agent is divided into between 4 and 40 
locations. 



13. The method according to claim 11, wherein the 
locations have an area of about 10000 μm², the locations 
being separated from each other by a distance of 100 to 1000 
μm. 

14. The method according to claim 11, wherein the 
concentrations of a plurality of different analytes in the 
liquid sample are determined using a plurality of arrays on 
said support. 

15. The method according to claim 11, wherein the 
specific binding agent is an antibody and the analyte is an 
antigen. 

16. The method according to claim 11, wherein the 
specific binding agent is an oligonucleotide and the analyte 
is a nucleic acid. 

17. A method for determining a value representative of a 
fraction of binding sites of a specific binding agent including 
binding sites specific for an analyte which binding sites are 
occupied by the analyte present in a liquid sample, said 
method comprising the steps of: 

(a) immobilizing the specific binding agent on a solid 
support, wherein the specific binding agent used for the 
fractional occupancy determination is present in an 
amount less than 0.1 V/K moles, where V is the volume 
of the liquid sample and K is the association constant 
for the analyte specifically binding to the specific 
binding agent, and wherein said specific binding agent 
is divided into an array of spatially separated locations; 

(b) contacting the support with the liquid sample so that 
a fraction of the binding sites of the binding agent 
specific for the analyte specifically bind the analyte; 

(c) contacting the support with a developing agent 
labelled with a signal-producing marker such that the 
labelled developing agent binds to unoccupied binding 
sites, to specifically bound analyte or to the binding 
sites with specifically bound analyte; 

(d) separating non-specifically bound developing agent 
from the solid support and measuring the signal pro- 
duced by the marker at each of the locations in the array 
to obtain a value which represents the fraction of the 
binding sites occupied by the analyte at each location; 
and 

(e) adding the values obtained at the locations in the array 
to provide a total signal which indicates the fraction of 
the binding sites in the specific binding agent occupied 
by the analyte. 

18. The method according to claim 17, wherein the 
specific binding agent is divided into between 4 and 40 
locations. 

19. The method according to claim 17, wherein the 
locations are in an area of about 10000 μm², the locations 
being separated from each other by a distance of 100 to 1000 
μm. 

20. The method according to claim 17, wherein the 
fraction of occupied binding sites is determined for a plu- 
rality of different analytes in the liquid sample using a 
plurality of arrays on said support. 

21. The method according to claim 17, wherein (i) the 
specific binding agent is an antibody and the analyte is an 
antigen or (ii) the specific binding agent is an oligonucle- 
otide and the analyte is a nucleic acid. 
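
By way of illustration only, the arithmetic recited in the foregoing claims may be sketched as follows. This is a minimal sketch and not the claimed method itself: the function names are hypothetical, V is assumed to be in liters and K in liters per mole, and linear interpolation between standard solutions is an assumption, since the claims recite only a comparison with standards.

```python
from bisect import bisect_left


def max_binding_agent_moles(volume_l, k_assoc_l_per_mol):
    """Upper bound 0.1 V/K (in moles) on the amount of binding agent."""
    return 0.1 * volume_l / k_assoc_l_per_mol


def total_signal(location_values):
    """Add the values measured at each location in the array."""
    return sum(location_values)


def concentration_from_standards(signal, standards):
    """Estimate concentration from (total_signal, concentration) standards.

    Linear interpolation between neighboring standards is an assumption
    made for this sketch; the source does not prescribe a fitting method.
    """
    points = sorted(standards)
    signals = [s for s, _ in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside the range of the standards")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][1]
    (s0, c0), (s1, c1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
```

For example, a 1 mL sample and an association constant of 1e10 L/mol would give a bound of about 1e-14 moles of binding agent under these unit assumptions.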

[57] ABSTRACT 

Methods and compositions for modeling the transcriptional 
responsiveness of an organism to a candidate drug involve 
(a) detecting reporter gene product signals from each of a 
plurality of different, separately isolated cells of a target 
organism, wherein each cell contains a recombinant con- 
struct comprising a reporter gene operatively linked to a 
different endogenous transcriptional regulatory element of 
the target organism such that the transcriptional regulatory 
element regulates the expression of the reporter gene, and 
the sum of the cells comprises an ensemble of the transcrip- 
tional regulatory elements of the organism sufficient to 
model the transcriptional responsiveness of said organism to 
a drug; (b) contacting each cell with a candidate drug; (c) 
detecting reporter gene product signals from each cell; (d) 
comparing reporter gene product signals from each cell 
before and after contacting the cell with the candidate drug 
to obtain a drug response profile which provides a model of 
the transcriptional responsiveness of said organism to the 
candidate drug. 

8 Claims, No Drawings 
METHODS FOR DRUG SCREENING 
BACKGROUND 

The field of the invention is pharmaceutical drug screen- 
ing. Pharmaceutical research and development is a multi- 
billion dollar industry. Much of these resources are con- 
sumed in efforts to focus the specificity of lead compounds. 
In addition, many programs are aborted after decades of 
costly yet fruitless efforts to limit side effects or toxicity of 
candidate drugs. Accordingly, tools that can abbreviate the 
research and discovery phase of drug development are 
desirable. Several in vitro or cell culture-based methods 
have been described for identifying compounds with a 
particular biological effect through the activation of a linked 
reporter. Gadski et al. (1992) EP 92304902.7 describes 
methods for identifying substances which regulate the syn- 
thesis of an apolipoprotein; Evans et al. (1991) U.S. Pat. No. 
4,981,784 describes methods for identifying ligands for a 
receptor; and Farr et al. (1994) WO 94/17208 describes 
methods and kits utilizing stress promoters to determine 
toxicity of a compound. 

In general, the principle that has been applied in the 
existing pharmaceutical industry for the discovery and 
development of new lead compounds for drugs has been the 
establishment of sensitive and reliable in vitro assays for 
purified enzymes, and then screening large numbers of 
compounds and culture supernatants for any ability to inhibit 
enzyme activity. The present invention exploits the recent 
advances in genome science to provide for the rapid screen- 
ing of large numbers of compounds against a systemic target 
comprising substantially all targets in a pathway, organism, 
etc. for rare compounds having the ability to inhibit the 
protein of interest. The invention described herein, in effect, 
turns the drug discovery process inside out. This invention 
provides information on the mechanism of action of every 
compound that affects cells, regardless of the target. In 
addition, the relative specificity of all lead compounds is 
immediately established. 

SUMMARY OF THE INVENTION 

The invention provides methods and compositions for 
estimating the physiological specificity of a candidate drug. 
In general, the subject methods involve (a) detecting reporter 
gene product signals from each of a plurality of different, 
separately isolated cells of a target organism, wherein each 
of said cells contains a recombinant construct comprising a 
reporter gene operatively linked to a different endogenous 
transcriptional regulatory element (e.g. promoter) of said 
target organism such that said transcriptional regulatory 
element regulates the expression of said reporter gene, 
wherein said plurality of cells comprises an ensemble of the 
transcriptional regulatory elements of said organism suffi- 
cient to model the transcriptional responsiveness of said 
organism to a drug; (b) contacting each said cell with a 
candidate drug; (c) detecting reporter gene product signals 
from each of said cells; (d) comparing said reporter gene 
product signals from each of said cells before and after 
contacting each of said cells with said candidate drug to 
obtain a drug response profile; wherein said drug response 
profile provides an estimate of the physiological specificity 
or biological interactions of said candidate drug. 

DETAILED DESCRIPTION OF THE 
INVENTION 

The Genome Reporter Matrix. 

The invention provides methods and compositions for 
estimating the physiological specificity of a candidate drug 
by modeling the transcriptional responses of the target 
organism with an ensemble of reporters, the expressions of 
which are regulated by transcription regulatory genetic 
elements derived from the genome of the target organism. 
The ensemble of reporting cells comprises as comprehensive 
a collection of transcription regulatory genetic elements as is 
conveniently available for the targeted organism so as to 
most accurately model the systemic transcriptional response. 
Suitable ensembles generally comprise thousands of indi- 
vidually reporting elements; preferred ensembles are sub- 
stantially comprehensive, i.e. provide a transcriptional 
response diversity comparable to that of the target organism. 
Generally, a substantially comprehensive ensemble requires 
transcription regulatory genetic elements from at least a 
majority of the organism's genes, and preferably includes 
those of all or nearly all of the genes. We term such a 
substantially comprehensive ensemble a genome reporter 
matrix. 

It is frequently convenient to use an ensemble or genome 
reporter matrix derived from a lower eukaryote or common 
animal model to obtain preliminary information on drug 
specificity in higher eukaryotes, such as humans. Because 
yeast such as Saccharomyces cerevisiae is a bona fide 
eukaryote, there is substantial conservation of biochemical 
function between yeast and human cells in most pathways, 
from the sterol biosynthetic pathway to the Ras oncogene. 
Indeed, the absence of many effective antifungal compounds 
illustrates how difficult it has been to find therapeutic targets 
that would selectively kill fungal but not human cells. One 
example of a shared response pathway is sterol biosynthesis. 
In human cells, the drug Mevacor (lovastatin) inhibits 
HMG-CoA reductase, the key regulatory enzyme of the 
sterol biosynthetic pathway. As a result, the level of a 
particular regulatory sterol decreases, and the cells respond 
by increased transcription of the gene encoding the LDL 
receptor. In yeast, Mevacor also inhibits HMG-CoA reduc- 
tase and lowers the level of a key regulatory sterol. Yeast 
cells respond in an analogous fashion to human cells. 
However, yeast do not have a gene for the LDL receptor. 
Instead, the same effect is measured by increased transcrip- 
tion of the ERG 10 gene, which encodes acetoacetyl CoA 
thiolase, an enzyme also involved in sterol synthesis. Thus 
the regulatory response is conserved between yeast and 
humans, even though the identity of the responding gene is 
different. 

Advantages of the Genome Reporter Matrix as a 
Vehicle for Pharmaceutical Development 

The advantages of the subject methods over prior art 
screening methods may be illustrated by examples. Consider 
the difference between an in vitro assay for HMG-CoA 
reductase inhibitors as presently practiced by the pharma- 
ceutical industry, and an assay for inhibitors of sterol bio- 
synthesis as revealed by the ERG 10 reporter. In the case of 
the former, information is obtained only for those rare 
compounds that happen to inhibit this one enzyme. In 
contrast, in the case of the ERG 10 reporter, any compound 
that inhibits nearly any of the approximately 35 steps in the 
sterol biosynthetic pathway will, by lowering the level of 
intracellular sterols, induce the synthesis of the reporter. 
Thus, the reporter can detect a much broader range of targets 
than can the purified enzyme, in this case 35 times more than 
the in vitro assay. 

Drugs often have side effects that are in part due to the 
lack of target specificity. However, the in vitro assay of 
HMG-CoA reductase provides no information on the speci- 
ficity of a compound. In contrast, a genome reporter matrix 
reveals the spectrum of other genes in the genome also 
affected by the compound. In considering two different 
compounds both of which induce the ERG 10 reporter, if one 
compound affects the expression of 5 other reporters and a 
second compound affects the expression of 50 other report- 
ers, the first compound is, a priori, more likely to have fewer 
side effects. Because the identity of the reporters is known 
or determinable, information on other affected reporters is 
informative as to the nature of the side effect. A panel of 
reporters can be used to test derivatives of the lead com- 
pound to determine which of the derivatives have greater 
specificity than the first compound. 
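
By way of illustration only, the comparison of compounds by the number of other reporters they affect may be sketched as follows. The function names, the reporter labels, and the two-fold cutoff used to call a reporter "affected" are assumptions made for this sketch and are not specified in the text.

```python
def affected_reporters(profile, threshold=2.0):
    """Return reporters whose induction ratio changed at least threshold-fold.

    `profile` maps reporter name -> induction ratio (stimulated / basal).
    The default two-fold cutoff is an arbitrary assumption.
    """
    return {name for name, ratio in profile.items()
            if ratio >= threshold or ratio <= 1.0 / threshold}


def off_target_count(profile, target, threshold=2.0):
    """Count affected reporters other than the intended target reporter."""
    return len(affected_reporters(profile, threshold) - {target})


def rank_by_specificity(profiles, target, threshold=2.0):
    """List compounds that affect the target, fewest off-target hits first."""
    hits = [name for name, profile in profiles.items()
            if target in affected_reporters(profile, threshold)]
    return sorted(
        hits,
        key=lambda name: off_target_count(profiles[name], target, threshold),
    )
```

A compound that affects the target reporter and 5 others would rank ahead of one that affects the target and 50 others, and the set returned by `affected_reporters` names the off-target reporters for follow-up.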

As another example, consider the case of a compound that 
does not affect the in vitro assay for HMG-CoA reductase 
nor induces the expression of the ERG 10 reporter. In the 
traditional approach to drug discovery, a compound that 
does not inhibit the target being tested provides no useful 
information. However, a compound having any significant 
effect on a biological process generally has some conse- 
quence on gene expression. A genome reporter matrix can 
thus provide two different kinds of information for most 
compounds. In some cases, the identity of reporter genes 
affected by the inhibitor provides evidence as to how the 
inhibitor functions. For example, a compound that induces 
a cAMP-dependent promoter in yeast may affect the 
activity of the 
Ras pathway. Even where the compound affects the expres- 
sion of a set of genes that do not evidence the action of the 
compound, the matrix provides a comprehensive assessment 
of the action of the compound that can be stored in a 
database for later analyses. A library of such matrix response 
profiles can be continuously investigated, much as the 
Spectral Compendiums of chemistry are continually refer- 
enced in the chemical arts. For example, if the database 
reveals that compound X alters the expression of gene Y, and 
a paper is published reporting that the expression of gene Y 
is sensitive to, for example, the inositol phosphate signaling 
pathway, compound X is a candidate for modulating the 
inositol phosphate signaling pathway. In effect, the genome 
reporter matrix is an informational translator that takes 
For example, taxol, a recent advance in potential breast 
cancer therapies, has been shown to interfere with tubulin- 
based cytoskeletal elements. Hence, a dominant mutant form 
of tubulin provides a response profile informative for breast 
cancer therapies with similar modes of action to taxol. 
Specifically, a dominant mutant form of tubulin is intro- 
duced into all the strains of the genome reporter matrix and 
the effect of this dominant mutant, which interferes with the 
microtubule cytoskeleton, evaluated for each reporter. Thus, 
any new compound that induces the same response profile as 
the dominant tubulin mutant would provide a candidate for 
a taxol-like pharmaceutical. 
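
By way of illustration only, the search for compounds that induce the same response profile as a dominant mutant may be sketched as a profile-similarity comparison. The use of Pearson correlation, the 0.9 cutoff, the representation of each profile as log induction ratios, and all names below are assumptions made for this sketch; the text does not specify how similarity between profiles is measured.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation undefined for a constant profile")
    return cov / (sd_x * sd_y)


def candidates_like_mutant(mutant_profile, compound_profiles, min_r=0.9):
    """Return compounds whose profiles correlate with the mutant's profile.

    Each profile maps reporter name -> log induction ratio; every compound
    profile is assumed to cover the same reporters as the mutant profile.
    """
    reporters = sorted(mutant_profile)
    reference = [mutant_profile[r] for r in reporters]
    matches = []
    for name, profile in compound_profiles.items():
        values = [profile[r] for r in reporters]
        if pearson(reference, values) >= min_r:
            matches.append(name)
    return sorted(matches)
```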

In addition, the genome reporter matrix can be used to 
genetically create or model various disease states. In this 
way, pathways present specifically in the disease state can be 
targeted. For example, the specific response profile of trans- 
forming mutant Ras2^val19 identifies Ras2^val19-induced 
reporters. Here, the matrix, in which each unit contains the 
Ras2^val19 mutation, is used to screen for compounds that 
restore the response profile to that of the matrix lacking the 
mutation. 

Though these examples are directed to the development of 
human therapeutics, informative response profiles can often 
be obtained in nonhuman reporter matrices. Hence, for 
disease causing genes with yeast homologs, even if the 
function of the gene is not known, a dominant form of the 
gene can be introduced into a yeast-based reporter matrix to 
identify disease state specific pathways for targeting. For 
example, a reporter matrix comprising the yeast mutant 
Ras2^val19 provides a discovery vehicle for pathways specific 
to the human analog, the oncogene Ras2^val12. 

Application of Novel Combinatorial Chemistries 
with the Genome Reporter Matrix. 

Among the most important advances in drug development 
have been advances in combinatorial synthesis of chemical 
libraries. In conventional drug screening with purified 
enzyme targets, combinatorial chemistries can often help 
already have been found to affect the expression of that gene. 
This tool should dramatically shorten the research and 
discovery phase of drug development, and effectively lever- 
age the value of the publicly available research portfolio on 
all genes. 
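
By way of illustration only, the database use described above, in which a compound already found to alter the expression of a gene becomes a candidate once that gene is linked to a pathway, may be sketched as an inverted index from genes to compounds. The function names and the dictionary layout are assumptions made for this sketch.

```python
from collections import defaultdict


def index_by_gene(response_db):
    """Invert compound -> affected genes into gene -> compounds."""
    index = defaultdict(set)
    for compound, genes in response_db.items():
        for gene in genes:
            index[gene].add(compound)
    return index


def candidates_for_pathway(index, pathway_genes):
    """Return compounds already known to affect any gene in the pathway."""
    found = set()
    for gene in pathway_genes:
        # .get avoids creating empty entries for genes never seen before.
        found |= index.get(gene, set())
    return sorted(found)
```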

In many cases, a drug of interest would work on protein 
targets whose impact on gene expression would not be 
known a priori. The genome reporter matrix can neverthe- 
less be used to estimate which genes would be induced or 
repressed by the drug. In one embodiment, a dominant 
mutant form of the gene encoding a drug-targeted protein is 
introduced into all the strains of the genome reporter matrix 
and the effect of the dominant mutant, which interferes with 
the gene product's normal function, evaluated for each 
reporter. This genetic assay informs us which genes would 
be affected by a drug that has a similar mechanism of action. 
In many cases, the drug itself could be used to obtain the 
same information. However, even if the drug itself were not 
available, genetics can be used to predetermine what its 
response profile would be in the genome reporter matrix. 
Furthermore, it is not necessary to know the identity of any 
of the responding genes. Instead, the genetic control with the 
dominant mutant sorts the genome into those genes that 
respond and those that do not. Hence, if drugs that disrupt a 
given cellular function were desired, dominant mutants for 
such function introduced into the genome reporter matrix 
reveal what response profile to expect for such an agent. 



inhibit the target enzyme but with some different and desir- 
able property. However, conventional methods would fail to 
recognize a molecule having a substantially divergent speci- 
ficity. The genome reporter matrix offers a simple solution to 
recognizing new specificities in combinatorial libraries. Spe- 
cifically, pools of new compounds are tested as mixtures 
across the matrix. If the pool has any new activity not 
present in the original lead compound, new genes are 
affected among the reporters. The identity of that gene 
provides a guide to the target of the new compound. Fur- 
thermore, the matrix offers an added bonus that compensates 
for a common weakness in most chemical syntheses. Spe- 
cifically, most syntheses produce the desired product in 
greatest abundance and a collection of other related products 
as contaminants due to side reactions in the synthesis. 
Traditionally the solution to contaminants is to purify away 
from them. However, the genome reporter matrix exploits 
the presence of these contaminants. Syntheses can be 
adjusted to make them less specific with a greater number of 
side reactions and more contaminants to determine whether 
anything in the total synthesis affects the expression of target 
genes of interest. If there is a component of the mixture with 
the desired activity on a particular reporter, that reporter can 
be used to assay purification of the desired component from 
the mixture. In effect, the reporter matrix allows a focused 
survey of the effect on single genes to compensate for the 
impurity of the mixture being tested. 
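
By way of illustration only, the two uses of the matrix described in this passage may be sketched as follows: finding reporters affected by a pool but not by the original lead compound, and using one reporter's signal to follow an activity through purification fractions. The function names and labels are hypothetical, and the sets of affected reporters are assumed to have been determined beforehand.

```python
def new_activities(lead_affected, pool_affected):
    """Reporters affected by the pool but not by the original lead compound."""
    return sorted(set(pool_affected) - set(lead_affected))


def most_active_fraction(fraction_signals):
    """Pick the purification fraction with the highest reporter signal.

    `fraction_signals` maps fraction label -> signal of the chosen reporter.
    """
    if not fraction_signals:
        raise ValueError("no fractions supplied")
    return max(fraction_signals, key=fraction_signals.get)
```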
Isoprenoids are an especially attractive class for the genome 
reporter matrix. In nature, isoprenoids are the champion 
signaling molecules. Isoprenoids are derivatives of the five 
carbon compound isoprene, which is made as an interme- 
diate in cholesterol biosynthesis. Isoprenoids include many 
of the most famous fragrances, pigments, and other biologi- 
cally active compounds, such as the antifungal sesquiterpe- 
noids, which plants use defensively against fungal infection. 
There are roughly 10,000 characterized isoprene derivatives 
and many more potential ones. Because these compounds 
are used in nature to signal biological processes, they are 
likely to include some of the best membrane permeant 
molecules. 

Isoprenes possess another characteristic that lends itself 
well to drug discovery through the genome reporter matrix. 
Pure isoprenoid compounds can be chemically treated to 
create a wide mixture of different compounds quickly and 
easily, due to the particular arrangement of double bonds in 
the hydrocarbon chains. In effect, isoprenoids can be 
mutagenized from one form into many different forms much 
as a wild-type gene can be mutagenized into many different 
mutants. For example, vitamin D used to fortify milk is 
produced by ultraviolet irradiation of the isoprene derivative 
known as ergosterol. New biologically active isoprenoids 
are generated and analyzed with a genome reporter matrix as 
follows. First, a pure isoprenoid such as limonene is tested to 
determine its response profile across the matrix. Next, the 
isoprenoid (e.g. limonene) is chemically altered to create a 
mixture of different compounds. This mixture is then tested 
across the matrix. If any new responses are observed, then 
the mixture has new biologically active species. In addition, 
the identity of the reporter genes provides information 
regarding what the new active species does, an activity to be 
used to monitor its purification, etc. This strategy is also 
applied to other mutable chemical families in addition to 
isoprenoids. 



Applications of the Genome Reporter Matrix in 
Antibiotic and Antifungal Discovery. 
Fungi are important pathogens on plants and animals and 
make a major impact on the production of many food crops 
and on animal, including human, health. One major diffi- 
culty in the development of antifungal compounds has been 
the problem of finding pharmaceutical targets in fungi that 
are specific to the fungus. The genome reporter matrix offers 
a new tool to solve this problem. Specifically, all molecules 
that fail to elicit any response in the Saccharomyces reporter 
are collected into a set, which by definition must be either 
inactive biologically or have a very high specificity. A 
reporter library is created from the targeted pathogen such as 
Cryptococcus, Candida, Aspergillus, Pneumocystis etc. All 
molecules from the set that do not affect Saccharomyces are 
tested on the pathogen, and any molecule that elicits an 
altered response profile in the pathogen in principle identi- 
fies a target that is pathogen-specific. As an example, a 
pathogen may have a novel signaling enzyme, such as an 
inositol kinase that alters a position on the inositol ring that 
is not altered in other species. A compound that inhibits that 
enzyme would affect the signaling pathway in the pathogen, 
and alter a response profile, but due to the absence of that 
enzyme in other organisms, would have no effect. By 
sequencing the reporter genes affected specifically in the 
target fungus and comparing the sequence with others in 
GenBank, one can identify biochemical pathways that are 
unique to the target species. Useful identified products 
include not only agents that kill the target fungus but also the 
identification of specific targets in the fungus for other 
pharmaceutical screening assays. 
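
By way of illustration only, the two-stage selection described above (collect molecules that elicit no response in the Saccharomyces matrix, then keep those that alter the response profile of the pathogen) may be sketched as follows. The function names and the two-fold cutoff for calling a reporter unchanged are assumptions made for this sketch.

```python
def is_silent(profile, threshold=2.0):
    """True if no reporter changes threshold-fold or more in either direction.

    `profile` maps reporter name -> induction ratio (stimulated / basal).
    """
    return all(1.0 / threshold < ratio < threshold
               for ratio in profile.values())


def pathogen_specific(saccharomyces_profiles, pathogen_profiles, threshold=2.0):
    """Compounds silent in Saccharomyces but active in the pathogen matrix."""
    silent = [name for name, profile in saccharomyces_profiles.items()
              if is_silent(profile, threshold)]
    return sorted(
        name for name in silent
        if name in pathogen_profiles
        and not is_silent(pathogen_profiles[name], threshold)
    )
```

Compounds that are silent in both matrices remain ambiguous under this sketch: they may be biologically inactive or may act on a target absent from both organisms.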

The identification of compounds that kill bacteria has 
been successfully pursued by the pharmaceutical industry 
for decades. It is rather simple to spot a compound that kills 
bacteria in a spot test on a petri plate. Unfortunately, growth 
inhibition screens have provided very limited lead com- 
pound diversity. However, there is much complexity to 
bacterial physiology and ecology that could offer an edge to 
development of combination therapies for bacteria, even for 
compounds that do not actually kill the bacterial cell. 
Consider for example the bacteria that invade the urethra 
and persist there through the elaboration of surface attach- 
ments known as fimbriae. Antibiotics in the urine stream 
have limited access to the bacteria because the urine stream 
is short-lived and infrequent. However, if one could block 
the synthesis of the fimbriae to detach the bacteria, existing 
therapies would become more effective. Similarly, if the 
chemotaxis mechanism of bacteria were crippled, the ability 
of bacteria to establish an effective infection would, in some 
species, be compromised. A genome reporter matrix for a 
bacterial pathogen that contains reporters for the expression 
of genes involved in chemotaxis or fimbriae synthesis, as 
examples, identifies not only compounds that do kill the 
bacteria in a spot test, but also those that interfere with key 
steps in the biology of the pathogen. These compounds 
would be exceedingly difficult to discover by conventional 
means. 

Applications of Human Cell Based Genome 
Reporter Matrices. 

A genome reporter matrix based on human cells provides 
many important applications. For example, an interesting 
application is the development of antiviral compounds. 
When human cells are infected by a wide range of viruses, 
the cells respond in a complex way in which only a few of 
the components have been identified. For example, certain 
interferons are induced as is a double-stranded RNase. Each 
of these responses individually provides some measure of 
protection. A matrix that reports the induction of interferon 
genes and the double-stranded RNase is able to detect 
compounds that could prophylactically protect cells before 
the arrival of the virus. Other protective effects may be 
induced in parallel. The incorporation of a panel of other 
reporter genes in the matrix is used to identify those com- 
pounds with the highest degree of specificity. 

Use of the Genome Reporter Matrix. 

The procedure to be followed in the subject methods will 
now be outlined. The initial step involves determining the 
basal or background response profile by detecting reporter 
gene product signals from each of a plurality of different, 
separately isolated cells of a target organism under one or 
more of a variety of physical conditions, such as temperature 
and pH, medium, and osmolarity. As discussed above, the 
target organism may be a yeast, animal model, human, plant, 
pathogen, etc. Generally, the cells are arranged in a physical 
matrix such as a microtiter plate. Each of the cells contains 
a recombinant construct comprising a reporter gene opera- 
tively linked to a different endogenous transcriptional regu- 
latory element of said target organism such that said tran- 
scriptional regulatory element regulates the expression of 
said reporter gene. A sufficient number of different recom- 
binant cells are included to provide an ensemble of tran- 
scriptional regulatory elements of said organism sufficient to 
model the transcriptional responsiveness of said organism to 
a drug. In a preferred embodiment, the matrix is substan- 
tially comprehensive for the selected regulatory elements, 
e.g. essentially all of the gene promoters of the targeted 
organism are included. Other cis-acting or trans-acting tran- 
scription regulatory regions of the targeted organism can 
also be evaluated. In one embodiment, a genome reporter 
matrix is constructed from a set of lacZ fusions to a 
substantially comprehensive set of yeast genes. The fusions 
are preferably constructed in a diploid cell of the a/α mating 
type to allow the introduction of dominant mutations by 
mating, though haploid strains also find use with particularly 
sensitive reporters for certain functions. The fusions are 
conveniently arrayed onto a microtiter plate having 96 wells 
separating distinct fusions into wells having defined alpha- 
numeric X-Y coordinates, where each well (defined as a 
unit) confines a cell or colony of cells having a construct of 
a reporter gene operatively joined to a different transcrip- 
tional promoter. Permanent collections of these plates are 
readily maintained at -80° C. and copies of this collection 
can be made and propagated by simple mechanics and may 
be automated with commercial robotics. 

The methods involve detecting a reporter gene product 
signal for each cell of the matrix. A wide variety of reporters 
may be used, with preferred reporters providing conve- 
niently detectable signals (e.g. by spectroscopy). Typically, 
the signal is a change in one or more electromagnetic 
properties, particularly optical properties at the unit. As 
examples, a reporter gene may encode an enzyme which 
catalyzes a reaction at the unit which alters light absorption 
properties at the unit, radiolabeled or fluorescent tag-labeled 
nucleotides can be incorporated into nascent transcripts 
which are then identified when bound to oligonucleotide 
probes, etc. Examples include β-galactosidase, invertase, 
green fluorescent protein, etc. Invertase fusions have the 
virtue that functional fusions can be selected from complex 
libraries by the ability of invertase to support growth on 
sucrose, allowing identification of those genes whose 
expression increases or decreases by measuring the 
relative growth on medium containing sucrose with or 
without the compound of interest. Electronic detectors for 
optical, radiative, etc. signals are commercially available, 
e.g. automated, multi-well colorimetric detectors, similar to 
automated ELISA readers. Reporter gene product signals 
may also be monitored as a function of other variables such 
as stimulus intensity or duration, time (for dynamic response 
analyses), etc. 

In a preferred embodiment, the basal response profiles are 
determined through the colorimetric detection of a lacZ 
reaction product. The optical signal generated at each well is 
detected and linearly transduced to generate a corresponding 
digital electrical output signal. The resultant electrical out- 
put signals are stored in computer memory as a genome 
reporter output signal matrix data structure associating each 
output signal with the coordinates of the corresponding 
microtiter plate well and the stimulus or drug. This infor- 
mation is indexed against the matrix to form reference 
response profiles that are used to determine the response of 
each reporter to any milieu in which a stimulus may be 
provided. 
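
By way of illustration only, a data structure associating each output signal with the coordinates of its microtiter plate well and with the stimulus may be sketched as follows. The class name, method names, and the A1 to H12 labeling of a 96-well plate are assumptions made for this sketch; the text requires only that signals be indexed by well coordinates and stimulus.

```python
from dataclasses import dataclass, field
from string import ascii_uppercase


def well_coordinates(rows=8, cols=12):
    """Alphanumeric well labels for a plate, e.g. A1..H12 for 96 wells."""
    return [f"{ascii_uppercase[r]}{c}"
            for r in range(rows) for c in range(1, cols + 1)]


@dataclass
class ReporterMatrix:
    """Signals indexed by (stimulus, well); each well holds one reporter."""
    reporter_at: dict                            # well label -> reporter name
    signals: dict = field(default_factory=dict)  # (stimulus, well) -> signal

    def record(self, stimulus, well, signal):
        """Store the digitized signal for one well under one stimulus."""
        if well not in self.reporter_at:
            raise KeyError(f"unknown well: {well}")
        self.signals[(stimulus, well)] = float(signal)

    def profile(self, stimulus):
        """Return reporter name -> signal for the given stimulus."""
        return {self.reporter_at[well]: value
                for (stim, well), value in self.signals.items()
                if stim == stimulus}
```

Storing the basal reading under its own stimulus label (for example "basal") keeps reference and drug-treated profiles in the same structure, so they can be compared well by well.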

After establishing a basal response profile for the matrix, 
each cell is contacted with a candidate drug. The term drug 
is used loosely to refer to agents which can provoke a 
specific cellular response. Preferred drugs are pharmaceuti- 
cal agents, particularly therapeutic agents. The drug induces 
a complex response pattern of repression, silence and induc- 
tion across the matrix (i.e. a decrease in reporter activity at 
some units, an increase at others, and no change at still 
others). The response profile reflects the cell's transcrip- 
tional adjustments to maintain homeostasis in the presence 
of the drug. While a wide variety of candidate drugs can be 
evaluated, it is important to adjust the incubation conditions 
(e.g. concentration, time, etc.) to preclude cellular stress, and 
hence ensure the measurement of pharmaceutically relevant 
response profiles. Hence, the methods monitor transcrip- 
tional changes which the cell uses to maintain cellular 
homeostasis. Cellular stress may be monitored by any con- 
venient way such as membrane potential (e.g. dye exclu- 
sion), cellular morphology, expression of stress response 
genes, etc. In a preferred embodiment, the compound treat- 
ment is performed by transferring a copy of the entire matrix 
to fresh medium containing the first compound of interest. 

After contacting the cells with the candidate drug, the 
reporter gene product signals from each of said cells are again
measured to determine a stimulated response profile. The
basal or background response profile is then compared with
(e.g. subtracted from, or divided into) the stimulated 
response profile to identify the cellular response profile to 
the candidate drug. The cellular response can be character- 
ized in a number of ways. For example, the basal profile can 
be subtracted from the stimulated profile to yield a net 
stimulation profile. In another embodiment, the stimulated 
profile is divided by the basal profile to yield an induction 
ratio profile. Such comparison profiles provide an estimate 
of the physiological specificity of the candidate drug. 
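
The two comparisons described above can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the disclosed method: the well labels, signal values, function names and the `floor` guard against division by zero are all assumptions introduced for the example.

```python
# Illustrative sketch only: names, well labels and values are hypothetical.

def net_stimulation(basal, stimulated):
    """Subtract the basal signal from the stimulated signal, well by well."""
    return {well: stimulated[well] - basal[well] for well in basal}


def induction_ratio(basal, stimulated, floor=1e-9):
    """Divide the stimulated signal by the basal signal, well by well.

    `floor` is an assumed guard that avoids division by zero for silent wells.
    """
    return {well: stimulated[well] / max(basal[well], floor) for well in basal}


# Signals keyed by microtiter plate well (hypothetical readings).
basal = {"A1": 2.0, "A2": 4.0, "A3": 1.0}
stimulated = {"A1": 20.0, "A2": 4.0, "A3": 0.5}

net = net_stimulation(basal, stimulated)
ratio = induction_ratio(basal, stimulated)
```

In this toy data, well A1 is induced (ratio above 1), A2 is silent (ratio of 1) and A3 is repressed (ratio below 1).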

In another embodiment of the invention, a matrix of 
hybridization probes corresponding to a predetermined 
population of genes of the selected organism is used to 
specifically detect changes in gene transcription which result 
from exposing the selected organism or cells thereof to a 
candidate drug. In this embodiment, one or more cells 
derived from the organism is exposed to the candidate drug 
in vivo or ex vivo under conditions wherein the drug effects 
a change in gene transcription in the cell to maintain 
homeostasis. Thereafter, the gene transcripts, primarily 
mRNA, of the cell or cells are isolated by conventional 
means. The isolated transcripts or cDNAs complementary 
thereto are then contacted with an ordered matrix of hybrid- 
ization probes, each probe being specific for a different one 
of the transcripts, under conditions wherein each of the 
transcripts hybridizes with a corresponding one of the 
probes to form hybridization pairs. The ordered matrix of 
probes provides, in aggregate, complements for an ensemble 
of genes of the organism sufficient to model the transcriptional
responsiveness of the organism to a drug. The probes
are generally immobilized and arrayed onto a solid substrate 
such as a microtiter plate. Specific hybridization may be 
effected, for example, by washing the hybridized matrix 
with excess non-specific oligonucleotides. A hybridization 
signal is then detected at each hybridization pair to obtain a 
matrix-wide signal profile. A wide variety of hybridization 
signals may be used; conveniently, the cells are pre-labeled 
with radionucleotides such that the gene transcripts provide 
a radioactive signal that can be detected in the hybridization 
pairs. The matrix-wide signal profile of the drug-stimulated 
cells is then compared with a matrix-wide signal profile of 
negative control cells to obtain a specific drug response 
profile. 

The invention also provides means for computer-based 
qualitative analysis of candidate drugs and unknown com- 
pounds. A wide variety of reference response profiles may be 
generated and used in such analyses. For example, the 
response of a matrix to loss of function of each protein or 
gene or RNA in the cell is evaluated by introducing a 
dominant allele of a gene to each reporter cell, and determining
the response of the reporter as a function of the
mutation. For this purpose, dominant mutations are preferred,
but other types of mutations can be used. Dominant
mutations are created by in vitro mutagenesis of cloned
genes followed by screening in diploid cells for dominant 
mutant alleles. 

In an alternative embodiment, the reporter matrix is 
developed in a strain deficient for the UPF gene function, 
wherein the majority of nonsense mutations cause a dominant
phenotype, allowing dominant mutations to be constructed
for any gene. UPF1 encodes a protein that causes
the degradation of mRNAs that, due to mutation, contain
premature termination codons. In mutants lacking UPF1
function, most nonsense mutations encode short truncated
protein fragments. Many of these interfere with normal
protein function and hence have dominant phenotypes. Thus,
in a upf1 mutant, many nonsense alleles behave as dominant
mutations (see, e.g. Leeds, P. et al. (1992) Molec. Cell
Biology 12:2165-77).

The resultant data identify genetic response profiles. 
These data are sorted by individual gene response to deter- 
mine the specificity of each gene to a particular stimulus. A 
weighting matrix is established which weights the signals 
proportionally to the specificity of the corresponding report- 
ers. The weighting matrix is revised dynamically, incorpo- 
rating data from every screen. A gene regulation function is 
then used to construct tables of regulation identifying which 
cells of the matrix respond to which mutation in an indexed 
gene, and which mutations affect which cells of the matrix. 

Response profiles for an unknown stimulus (e.g. new 
chemicals, unknown compounds or unknown mixtures) may 
be analyzed by comparing the new stimulus response pro- 
files with response profiles to known chemical stimuli. Such 
comparison analyses generally take the form of an indexed 
report of the matches to the reference chemical response
profiles, ranked according to the weighted value of each
matching reporter. If there is a match (i.e. perfect score), the
response profile identifies a stimulus with the same target as
one of the known compounds upon which the response
profile database is built. If the response profile is a subset of
cells in the matrix stimulated by a known compound, the
new compound is a candidate for a molecule with greater
specificity than the reference compound. In particular, if the
reporters responding uniquely to the reference chemical
have a low weighted response value, the new compound is
concluded to be of greater specificity. Alternatively, if the 
reporters responding uniquely to the reference compound 
have a high weighted response value, the new compound is 
concluded to be active downstream in the same pathway. If 
the output overlaps the response profile of a known refer- 
ence compound, the overlap is sorted by a quantitative 
evaluation with the weighting matrix to yield common and 
unique reporters. The unique reporters are then sorted 
against the regulation tables and best matches used to 
deduce the candidate target. If the response profile does not 
either overlap or match a chemical response profile, then the 
database is inadequate to infer function and the response 
profile may be added to the reference chemical response 
profiles. 
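
The ranked report of matches described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in the invention: response profiles are reduced to sets of responding reporters, the weighting matrix is reduced to one weight per reporter, and all reporter names, compound names and weights are hypothetical.

```python
# Illustrative sketch only: reporter names, weights and profiles are hypothetical.

def rank_references(new_profile, references, weights):
    """Rank reference profiles by the summed weight of shared reporters.

    new_profile: set of reporters responding to the unknown stimulus
    references:  dict mapping compound name -> set of responding reporters
    weights:     dict mapping reporter -> specificity weight
    """
    report = []
    for name, ref_profile in references.items():
        shared = new_profile & ref_profile
        score = sum(weights.get(reporter, 0.0) for reporter in shared)
        report.append((name, score, sorted(shared)))
    # Highest weighted match first.
    report.sort(key=lambda row: row[1], reverse=True)
    return report


weights = {"rep1": 3.0, "rep2": 2.0, "rep3": 0.5}
references = {
    "compound_X": {"rep1", "rep2", "rep3"},
    "compound_Y": {"rep3"},
}
report = rank_references({"rep1", "rep2"}, references, weights)
```

Here the unknown stimulus shares its two highly weighted reporters with compound_X and none with compound_Y, so compound_X heads the report.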

The response profile of a new chemical stimulus may also 
be compared to a known genetic response profile for target 
gene(s). If there is a match between the two response 
profiles, the target gene or its functional pathway is the 
presumptive target of the chemical. If the chemical response 
profile is a subset of a genetic response profile, the target of
the drug is downstream of the mutant gene but in the same
pathway. If the chemical response profile includes as a
subset a genetic response profile, the target of the chemical
is deduced to be in the same pathway as the target gene but 
upstream and/or the chemical affects additional cellular 
components. If not, the chemical response profile is novel 
and defines an orphan pathway. 
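
The decision rules in the preceding paragraph can be written as a short set comparison. The sketch below is illustrative only: it assumes each response profile has been reduced to a non-empty set of responding reporters, and the function name and returned strings are hypothetical.

```python
# Illustrative sketch only; assumes both profiles are non-empty sets of reporters.

def pathway_relation(chemical, genetic):
    """Compare a chemical response profile with a genetic response profile."""
    chemical, genetic = set(chemical), set(genetic)
    if chemical == genetic:
        # Exact match between the two profiles.
        return "target gene or its pathway is the presumptive target"
    if chemical < genetic:
        # Chemical profile is a proper subset of the genetic profile.
        return "target is downstream of the mutant gene in the same pathway"
    if chemical > genetic:
        # Chemical profile contains the genetic profile as a proper subset.
        return "target is upstream in the same pathway and/or additional components are affected"
    # Neither a match nor a subset in either direction.
    return "novel profile (orphan pathway)"
```

For example, `pathway_relation({"rep1"}, {"rep1", "rep2"})` falls into the subset case, while a profile that only partly overlaps the genetic profile is treated as novel, following the "If not" clause above.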

While described in terms of cells comprising reporters 
under the transcriptional control of endogenous regulatory 
regions, there are a number of other means of practicing the 
invention. For example, each unit of a genome reporter 
matrix reporting on gene expression might confine a differ- 
ent oligonucleotide probe capable of hybridizing with a 
corresponding different reporter transcript. Alternatively, 
each unit of a matrix reporting on DNA-protein interaction 
might confine a cell having a first construct of a reporter 
gene operatively joined to a targeted transcription factor 
binding site and a second hybrid construct encoding a 
transcription activation domain fused to a different structural 
gene, i.e. a one-dimensional one-hybrid system matrix. 
Alternatively, each unit of a matrix reporting on protein- 
protein interactions might confine a cell having a first 
construct of a reporter gene operatively joined to a targeted 
transcription factor binding site, a second hybrid construct 
encoding a transcription activation domain fused to a different
constitutively expressed gene and a third construct
encoding a DNA-binding domain fused to yet a different
constitutively expressed gene, i.e. a two-dimensional
two-hybrid system matrix.

The following examples are offered by way of illustration 
and not by way of limitation. 



EXAMPLES 

1. Transcriptional promoter-reporter gene matrix

A) Construction of a physical matrix stimulated with the
drug mevinolin (lovastatin, Mevacor).

Mevinolin is a compound known to inhibit cholesterol 
biosynthesis. Initially, the maximal non-toxic (as measured 
by cell growth and viability) concentration of mevinolin on 
the reporter cells was determined by serial dilution to be 25
μg/ml. To produce a mevinolin-stimulated matrix, each well
of 60 microtiter plates is filled with 100 μl culture medium
containing 25 μg/ml mevinolin in a 2% ethanol solution. An
aliquot of each member of the reporter matrix is added to
each well allowing for a dilution of approximately 1:100.
The cells are incubated in the medium until the turbidity of
the average reporter increases by 20 fold. Each well is then
quantified for turbidity as a measure of growth, and is treated
with a lysis solution to allow measurement of β-galactosidase
from each fusion.

B) Generation of an output signal matrix data structure. 
Both the turbidity and the β-galactosidase are read on
commercially available microtiter plate readers (e.g. Bio-Rad)
and the data captured as an ASCII file. From this file,
the value of the individual cells in the reporter matrix to a
2% ethanol solution in the reference response profile is
subtracted. The difference corresponds to the mevinolin
response profile. This file is converted in the computer to a
table indexed by the response of each cell to the inhibitor.
For example, the genes encoding acetoacetyl-CoA thiolase
and squalene synthase increase 10 fold, while SIR3 and
LEU2, two unrelated genes, remain unchanged. The
response of the reporter matrix to other compounds is 
similarly determined and stored as output response profiles. 

C) Comparison of Signal Matrix data structure with a 
Signal Matrix database.

A physical matrix is constructed as described above except
the mevinolin is replaced with an unknown test compound.
The resultant response profile is compared to the response
profiles of a library of known bioactive compounds and
analyzed as described above. For example, if the test compound
output profile shows both acetoacetyl-CoA thiolase
and squalene synthase gene induced, then the output profile
matches that expected of an inhibitor of cholesterol synthesis.
If the response profile has fewer other cells affected than
the response profile to mevinolin, the unknown compound is
a candidate for greater specificity. If the response profile of
the new chemical affects fewer other reporters than the 
response profile to mevinolin, and if the other reporters 
affected by mevinolin have a lower weighted value, then the 
compound is a candidate for greater specificity. If the 
response profile has more different cells affected than the 
response profile to mevinolin, then the compound is a 
candidate for less specificity. In the case where mixtures of 
compounds are tested, the highest weighted responses are 
evaluated to determine whether they can be deconvoluted 
into the response profile of two different compounds, or of 
two different genetic response profiles. 

2. Reporter transcript-oligonucleotide hybridization probe
matrix: Construction of stimulated physical matrix and
generation of an output signal matrix data structure.

Unlabeled oligonucleotide hybridization probes comple- 
mentary to the mRNA transcript of each yeast gene are 
arrayed on a silicon substrate etched by standard techniques 
(e.g. Fodor et al. (1991) Science 252, 767). The probes are 
of length and sequence to ensure specificity for the corre- 
sponding yeast gene, typically about 24-240 nucleotides in 
length. 

A confluent HeLa cell culture is treated with 15 μg/ml
mevinolin in 2% ethanol for 4 hours while maintained in a
humidified 5% CO2 atmosphere at 37° C. Messenger RNA
is extracted, reverse transcribed and fluorophore-labeled
according to standard methods (Sambrook et al., Molecular
Cloning, 3rd ed.). The resultant cDNA is hybridized to the
array of probes, the array is washed free of unhybridized
labeled cDNA, the hybridization signal at each unit of the
array quantified using a confocal microscope scanner
(instruments by Molecular Devices and Affymetrix), and the
resultant matrix response data stored in digital form. 

3. Two-dimensional two-hybrid matrix 

A) Construction of stimulated physical matrix. 

The two-dimensional two-hybrid (see, e.g. Chien et al.
(1991) PNAS, 88, 9578) matrix is designed to screen for
compounds that specifically affect the interaction of two
proteins, e.g. the interaction of a human signal transducer
and activator of transcription (STAT) with an interleukin
receptor. Two hybrid fusions are generated by standard
methods: each strain contains a portion of the targeted
human STAT gene, fused to a portion of a yeast or bacterial
gene encoding a DNA binding domain (e.g. GAL4:1-147).
The DNA sequence recognized by that DNA binding domain
(e.g. UASG) is inserted in place of the enhancer sequence 5'
to the selected reporter (e.g. lacZ). The strain also contains
another fusion consisting of an intracellular portion of the
targeted receptor gene whose protein product interacts with
the STAT. This receptor gene is fused with a gene fragment
encoding a transcriptional activation domain (e.g.
GAL4:768-881).

B) Generation of signal matrix data structure. 

Both the turbidity and the galactosidase are read on 
commercial microtiter plate readers (BioRad) and the data 
captured as an ASCII file. 

C) Comparison of signal matrix data structure with database.

Data are analyzed for those compounds that block the
interaction of the two human proteins by reducing the signal 
produced from the reporter in the various strains containing 
pairs of human proteins. The output is processed to identify 
compounds with a large impact on a reporter whose expres- 
sion is dependent on a single pair of interacting human 
proteins. An inverted weighting matrix is used to evaluate 
these data as preferred compounds do not affect even the 
least specific reporters in the matrix. 

All publications and patent applications cited in this 
specification are herein incorporated by reference as if each 
individual publication or patent application were specifically 
and individually indicated to be incorporated by reference. 
Although the foregoing invention has been described in 
some detail by way of illustration and example for purposes 
of clarity of understanding, it will be readily apparent to 
those of ordinary skill in the art in light of the teachings of 
this invention that certain changes and modifications may be 
made thereto without departing from the spirit or scope of 
the appended claims. 

What is claimed is: 

1. A method for modeling of the transcriptional respon- 
siveness of an organism to a candidate drug which has an 
effect on gene transcription in cells of said organism, com- 
prising steps: 

(a) detecting reporter gene product signals from each of a 
plurality of different, separately isolated cells of a target 
organism, wherein each of said cells contains a recom- 
binant construct comprising a reporter gene operatively 
linked to a different endogenous transcriptional regu- 
latory element of said target organism such that said 
transcriptional regulatory element regulates the expres- 
sion of said reporter gene, wherein said plurality of 
cells comprises an ensemble of the transcriptional 
regulatory elements of said organism sufficient to 
model the transcriptional responsiveness of said organ- 
ism to a drug; 

(b) contacting each of said cells with a candidate drug 
under conditions, wherein said cells maintain homeo- 
stasis; 

(c) detecting reporter gene product signals from each of 
said cells; 

(d) comparing said reporter gene product signals from 
each of said cells before and after contacting each of 
said cells with said candidate drug to obtain a drug 
response profile; 

wherein said drug response profile provides a model of 
the transcriptional responsiveness of said organism to 
said candidate drug. 

2. A method according to claim 1, said ensemble com- 
prising a majority of all different transcriptional regulatory 
elements of said organism. 

3. A method according to claim 1, said drug being a 
candidate human therapeutic. 

4. A method according to claim 1, wherein said cells are 
yeast cells. 

5. A method according to claim 1, wherein said cells are 
bacterial cells. 

6. A method according to claim 1, wherein said cells are 
human cells. 

7. A method according to claim 1, wherein the reporter 
gene is the lacZ gene, the suc2 gene, or a gene encoding a 
green fluorescent protein. 

8. A method according to claim 1, wherein said cells are 
eukaryotic cells. 

*****

superfamilies. Pearson found that modern matrices and "ln-scaling"
of raw scores improve results considerably. He also
reported that the rigorous Smith-Waterman algorithm worked
slightly better than FASTA, which was in turn more effective
than BLAST.

Very large scale analyses of matrices have been performed 
(10), and Henikoff and Henikoff (11) also evaluated the 
effectiveness of BLAST and FASTA. Their test with BLAST
considered the ability to detect homologs above a predetermined
score but had no penalty for methods which also
reported large numbers of spurious matches. The Henikoffs
searched the SWISS-PROT database (12) and used PROSITE (13)
to define homologous families. Their results showed that the 
BLOSUM62 matrix (14) performed markedly better than the 
extrapolated PAM-series matrices (15), which previously had 
been popular. 

A crucial aspect of any assessment is the data that are used 
to test the ability of the program to find homologs. But in 
Pearson's and the Henikoffs' evaluations of sequence comparison,
the correct results were effectively unknown. This is
because the superfamilies in PIR and PROSITE are principally
created by using the same sequence comparison methods
which are being evaluated. Interdependence of data and
methods creates a "chicken and egg" problem, and means, for
example, that new methods would be penalized for correctly
identifying homologs missed by older programs. For instance,
immunoglobulin variable and constant domains are clearly
homologous, but PIR places them in different superfamilies.
The problem is widespread: each superfamily in PIR 48.00 with
a structural homolog is itself homologous to an average of 1.6
other PIR superfamilies (16).

To surmount these sorts of difficulties, Sander and Schneider
(17) used protein structures to evaluate sequence comparison.
Rather than comparing different sequence comparison
algorithms, their work focused on determining a length-dependent
threshold of percentage identity, above which all
proteins would be of similar structure. A result of this analysis
was the HSSP equation; it states that proteins with 25% identity
over 80 residues will have similar structures, whereas shorter
alignments require higher identity. (Other studies also have
used structures (18-20), but these focused on a small number
of model proteins and were principally oriented toward evaluating
alignment accuracy rather than homology detection.)

A general solution to the problem of scoring comes from
statistical measures (i.e., E-values and P-values) based on the
extreme value distribution (21). Extreme value scoring was
implemented analytically in the BLAST program using the
Karlin and Altschul statistics (22, 23) and empirical approaches
have been recently added to FASTA and SSEARCH. In
addition to being heralded as a reliable means of recognizing
significantly similar proteins (24, 25), the mathematical tractability
of statistical scores "is a crucial feature of the BLAST
algorithm" (1). The validity of this scoring procedure has been
tested analytically and empirically (see ref. 2 and references in
ref. 24). However, all large empirical tests used random
sequences that may lack the subtle structure found within
biological sequences (26, 27) and obviously do not contain any
real homologs. Thus, although many researchers have suggested
that statistical scores be used to rank matches (24, 25,
28), there have been no large rigorous experiments on biological
data to determine the degree to which such rankings are
superior.

A Database for Testing Homology Detection. Since the
discovery that the structures of hemoglobin and myoglobin are
very similar though their sequences are not (29), it has been
apparent that comparing structures is a more powerful (if less
convenient) way to recognize distant evolutionary relationships
than comparing sequences. If two proteins show a high
degree of similarity in their structural details and function, it
is very probable that they have an evolutionary relationship
though their sequence similarity may be low.

The recent growth of protein structure information combined
with the comprehensive evolutionary classification in
the SCOP database (4, 5) have allowed us to overcome previous
limitations. With these data, we can evaluate the performance
of sequence comparison methods on real protein sequences
whose relationships are known confidently. The SCOP database
uses structural information to recognize distant homologs, the
large majority of which can be determined unambiguously.
These superfamilies, such as the globins or the immunoglobulins,
would be recognized as related by the vast majority of the
biological community despite the lack of high sequence similarity.

From SCOP we extracted the sequences of domains of
proteins in the Protein Data Bank (PDB) (30) and created two
databases. One (PDB90D-B) has domains which were all <90%
identical to any other, whereas the other (PDB40D-B) had those <40%
identical. The databases were created by first sorting all
protein domains in SCOP by their quality and making a list. The
highest quality domain was selected for inclusion in the
database and removed from the list. Also removed from the list
(and discarded) were all other domains above the threshold
level of identity to the selected domain. This process was
repeated until the list was empty. The PDB40D-B database
contains 1,323 domains, which have 9,044 ordered pairs of
distant relationships, or ~0.5% of the total 1,749,006 ordered
pairs. In PDB90D-B, the 2,079 domains have 53,988 relationships,
representing 1.2% of all pairs. Low complexity regions
of sequence can achieve spurious high scores, so these were
masked in both databases by processing with the SEG program
(27) using recommended parameters: 12 1.8 2.0. The databases
used in this paper are available from http://sss.stanford.edu/sss/,
and databases derived from the current version of SCOP
may be found at http://scop.mrc-lmb.cam.ac.uk/scop/.
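
The greedy selection procedure described in this paragraph can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' code: the `quality` and `identity` callables, the domain names and the toy identity values are hypothetical stand-ins for the SCOP quality ranking and pairwise percentage identity.

```python
# Illustrative sketch only: domain names, qualities and identities are hypothetical.

def select_nonredundant(domains, quality, identity, threshold):
    """Greedily keep the best domain and discard its close relatives.

    domains:   iterable of domain identifiers
    quality:   callable giving a quality score for a domain (higher is better)
    identity:  callable giving percentage identity between two domains
    threshold: domains at or above this identity to a kept domain are discarded
    """
    remaining = sorted(domains, key=quality, reverse=True)
    selected = []
    while remaining:
        best = remaining.pop(0)  # highest quality domain still on the list
        selected.append(best)
        # Drop every other domain above the threshold to the selected one.
        remaining = [d for d in remaining if identity(best, d) < threshold]
    return selected


quality = {"d1": 0.9, "d2": 0.8, "d3": 0.7, "d4": 0.6}
pairwise = {
    frozenset(("d1", "d2")): 95.0,
    frozenset(("d1", "d3")): 30.0,
    frozenset(("d3", "d4")): 45.0,
}


def identity(a, b):
    """Look up a toy percentage identity; unlisted pairs count as unrelated."""
    return pairwise.get(frozenset((a, b)), 0.0)


pdb90_like = select_nonredundant(quality, quality.get, identity, 90.0)
pdb40_like = select_nonredundant(quality, quality.get, identity, 40.0)
```

With the 90% threshold only the near-duplicate d2 is discarded, whereas the 40% threshold also removes d4. The count of ordered pairs quoted for the 1,323-domain database follows from pairing each domain with every other: 1,323 × 1,322 = 1,749,006.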

Analyses from both databases were generally consistent, but 
PDB40D-B focuses on distantly related proteins and reduces the
heavy overrepresentation in the PDB of a small number of
families (31, 32), whereas PDB90D-B (with more sequences)
improves evaluations of statistics. Except where noted otherwise,
the distant homolog results here are from PDB40D-B.
Although the precise numbers reported here are specific to the 
structural domain databases used, we expect the trends to be 
general. 

Assessment Data and Procedure. Our assessment of se- 
quence comparison may be divided into four different major 
categories of tests. First, using just a single sequence compar- 
ison algorithm at a time, we evaluated the effectiveness of 
different scoring schemes. Second, we assessed the reliability 
of scoring procedures, including an evaluation of the validity 
of statistical scoring. Third, we compared sequence comparison
algorithms (using the optimal scoring scheme) to determine
their relative performance. Fourth, we examined the
distribution of homologs and considered the power of pairwise 
sequence comparison to recognize them. All of the analyses 
used the databases of structurally identified homologs and a 
new assessment criterion. 

The analyses tested BLAST (1), version 1.4.9MP, and WU-BLAST2
(2), version 2.0a13MP. Also assessed was the FASTA
package, version 3.0t76 (3), which provided FASTA and the
SSEARCH implementation of Smith-Waterman (8). For
SSEARCH and FASTA, we used BLOSUM45 with gap penalties
-12/-1 (7, 16). The default parameters and matrix (BLOSUM62)
were used for BLAST and WU-BLAST2.

The "Coverage Vs. Error" Plot. To test a particular protocol 
(comprising a program and scoring scheme), each sequence 
from the database was used as a query to search the database. 
This yielded ordered pairs of query and target sequences with 
associated scores, which were sorted, on the basis of their 
scores, from best to worst. The ideal method would have

the substitution matrix scores for each position in the alignment
and subtracting gap penalties. In BLAST, a measure

extent of errors. Second, SSEARCH, WU-BLAST2 and FASTA
ktup = 1 perform best, though BLAST and FASTA ktup = 2
detect most of the relationships found by the best procedures
and are appropriate for rapid initial searches.

The homologous proteins that are found by sequence comparison
can be distinguished with high reliability from the huge
number of unrelated pairs. However, even the best database
searching procedures tested fail to find the large majority of
distant evolutionary relationships at an acceptable error rate.
Thus, if the procedures assessed here fail to find a reliable
match, it does not imply that the sequence is unique; rather, it
indicates that any relatives it might have are distant ones.**

**Additional and updated information about this work, including
supplementary figures, may be found at http://sss.stanford.edu/sss/.

of Response dated 12/04/03
In USSN 09/937,060 



Subject: RE: [Fwd: Toxicology Chip]
Date: Mon, 3 Jul 2000 06:09:45 -0400
From: "Afshari.Cynthia" <afshari@niehs.nih.gov>
To: "Diana Hamlet-Cox" <dianahc@incyte.com>

You can see the list of clones that we have on our 12K chip at
http://manuel.niehs.nih.gov/maps/guest/clonesrch.cfm.
We selected a subset of genes (2000K) that we believed critical to tox
response and basic cellular processes and added a set of clones to
this. We have included a set of control genes (80+) that were selected by
the NHGRI because they did not change across a large set of array
experiments. However, we have found that some of these genes change
significantly after tox treatments and are in the process of looking at the
variation of each of these 80+ genes across our experiments.
Our chips are constantly changing and being updated and we hope that our
data will lead us to what the toxchip should really be.

Hope this answers your question.
Cindy Afshari



> From: Diana Hamlet-Cox
> Sent: Monday, June 26, 2000 8:52 PM
> To: afshari@niehs.nih.gov
> Subject: [Fwd: Toxicology Chip]
>
> Dear Dr. Afshari,
>
> Since I have not yet had a response from Bill Grigg, perhaps he was not
> the right person to contact.
>
> Can you help me in this matter? I don't need to know the sequences
> necessarily, but I would like very much to know what types of sequences
> are being used, e.g., GPCRs (more specific?), ion channels, etc.
>
> Diana Hamlet-Cox
>
> Original Message
> Subject: Toxicology Chip
> Date: Mon, 19 Jun 2000 18:31:48 -0700
> From: Diana Hamlet-Cox <dianahc@incyte.com>
> Organization: Incyte Pharmaceuticals
> To: grigg@niehs.nih.gov
>
> Dear Colleague:
>
> I am doing literature research on the use of expressed genes as
> pharmacotoxicology markers, and found the Press Release dated February
> 29, 2000 regarding the work of the NIEHS in this area. I would like to
> know if there is a resource I can access (or you could provide?) that
> would give me a list of the 12,000 genes that are on your Human ToxChip
> Microarray. In particular, I am interested in the criteria used to
> select sequences for the ToxChip, including any control sequences
> included in the microarray.
>
> Thank you for your assistance in this request.
>
> Diana Hamlet-Cox, Ph.D.
> Incyte Genomics, Inc.
>
> --
>
> This email message is for the sole use of the intended recipient(s) and
> may contain confidential and privileged information subject to
> attorney-client privilege. Any unauthorized review, use, disclosure or
> distribution is prohibited. If you are not the intended recipient,
> please contact the sender by reply email and destroy all copies of the
> original message.



07/31/2000 10:34 AM

ABSTRACT The recent ability to sequence whole genomes
allows ready access to all genetic material. The approaches
outlined here allow automated analysis of sequence for the 
synthesis of optimal primers in an automated multiplex 
oligonucleotide synthesizer (AMOS). The efficiency is such 
that all ORFs for an organism can be amplified by PCR. The 
resulting amplicons can be used directly in the construction of 
DNA arrays or can be cloned for a large variety of functional 
analyses. These tools allow a replacement of single-gene 
analysis with a highly efficient whole-genome analysis. 



The genome sequencing projects have generated and will 
continue to generate enormous amounts of sequence data. The 
genomes of Saccharomyces cerevisiae, Escherichia coli, Haemophilus
influenzae (1), Mycoplasma genitalium (2), and Methanococcus
jannaschii (3) have been completely sequenced.
Other model organisms have had substantial portions of their
genomes sequenced as well, including the nematode Caenorhabditis
elegans (4) and the small flowering plant Arabidopsis
thaliana (5). This massive and increasing amount of sequence
information allows the development of novel experimental 
approaches to identify gene function. 

One standard use of genome sequence data is to attempt to
identify the functions of predicted open reading frames
(ORFs) within the genome by comparison to genes of known
function. Such a comparative analysis of all ORFs to existing
sequence data is fast, simple, and requires no experimentation
and is therefore a reasonable first step. While finding sequence
homologies/motifs is not a substitute for experimentation,
noting the presence of sequence homology and/or sequence
motifs can be a useful first step in finding interesting genes, in
designing experiments and, in some cases, predicting function.
However, this type of analysis is frequently uninformative. For
example, over one-half of new ORFs in S. cerevisiae have no
known function (6). If this is the case in a well studied organism
such as yeast, the problem will be even worse in organisms that
are less well studied or less manipulable. A large, experimentally
determined gene function database would make
homology/motif searches much more useful.

Experimental analysis must be performed to thoroughly
understand the biological function of a gene product. Scaling
up from classical "cottage industry" one-gene-oriented
approaches to whole-genome analysis would be very expensive
and laborious. It is clear that novel strategies are necessary to
efficiently pursue the next phase of the genome projects:
whole-genome experimental analysis to explore gene expression,
gene product function, and other genome functions.
Model organisms, such as S. cerevisiae, will be extremely
important in the development of novel whole-genome analysis
techniques and, subsequently, in improving our understanding
of other more complex and less manipulable organisms.

The publication costs of this article were defrayed in part by page charge
payment. This article must therefore be hereby marked "advertisement" in
accordance with 18 U.S.C. §1734 solely to indicate this fact.

The genome sequence can be systematically used as a tool
to understand ORFs, gene product function, and other genome
regions. Toward this end, a directed strategy has been
developed for exploiting sequence information as a means of
providing information about biological function (Fig. 1).
Efforts have been directed toward the amplification of each
predicted ORF or any other region of the genome ranging
from a few base pairs to several kilobase pairs. There are many
uses for these amplicons: they can be cloned into standard
vectors or specialized expression vectors, or can be cloned into
other specialized vectors such as those used for two-hybrid
analysis. The amplicons can also be used directly by, for
example, arraying onto glass for expression analysis, for DNA
binding assays, or for any direct DNA assay (7). As a pilot
study, synthetic primers were made on the 96-well automated
multiplex oligonucleotide synthesizer (AMOS) instrument (8)
(Fig. 2). These oligonucleotides were used to amplify each
ORF on yeast chromosome V. The current version of this
instrument can synthesize three plates of 96 oligonucleotides
each ([illegible] bases) in an 8-hr day. The amplification of the entire
set of PCR products was then analyzed by gel electrophoresis
(Fig. 3). Successful amplification of the proper length product
on the first attempt was 95%. This project demonstrates that
one can go directly from sequence information to biological
analysis in a truly automated, totally directed manner.
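
The stated synthesizer capacity (three 96-well plates per 8-hr day) can be turned into a rough scheduling estimate. The sketch below is only an illustration: it assumes two primers per ORF and completely filled plates, and it borrows the ORF counts given elsewhere in the article (240 ORFs in the chromosome V test set, about 6,000 ORFs for the genome). The function name `synthesis_days` is hypothetical.

```python
import math

OLIGOS_PER_PLATE = 96  # one oligonucleotide per well
PLATES_PER_DAY = 3     # capacity stated in the text for an 8-hr day


def synthesis_days(n_orfs, primers_per_orf=2):
    """Return (oligos, plates, days) needed to make primers for n_orfs ORFs.

    Assumes every plate is filled completely and no resynthesis is needed.
    """
    oligos = n_orfs * primers_per_orf
    plates = math.ceil(oligos / OLIGOS_PER_PLATE)
    days = math.ceil(plates / PLATES_PER_DAY)
    return oligos, plates, days


if __name__ == "__main__":
    print(synthesis_days(240))   # chromosome V test set
    print(synthesis_days(6000))  # approximate whole-genome ORF count
```

Under these assumptions the chromosome V set needs 5 plates (2 working days) and a 6,000-ORF genome needs 125 plates (42 working days) on a single instrument.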

These amplicons can be incorporated directly in arrays or
the amplicons can be cloned. If the amplicons are to be cloned,
novel sequences can be incorporated at the 5' end of the
oligonucleotide to facilitate cloning. One potential problem
with cloning PCR products is that the cloned amplicons may
contain sequence alterations that diminish their utility. One
option would be to resequence each individual amplicon.
However, this is expensive, inefficient, and time consuming. A
faster, more cost-effective, and more accurate approach is to
apply comparative sequencing by denaturing HPLC (9). This
method is capable of detecting a single base change in a 2-kb
heteroduplex. Longer amplicons can be analyzed by use of
appropriate restriction fragments. If any change is detected in
a clone, an alternate clone of the same region can be analyzed.
Modifying the system to allow high throughput analysis by
denaturing HPLC is also relatively simple and straightforward.

If amplicons are used directly on arrays without cloning, it
is important to note that, even if single PCR product bands are
observed on gels, the PCR products will be contaminated with
various amounts of other sequences. This contamination has
the potential to affect the results in, for example, expression
analysis. On the other hand, direct use of the amplicons is
much less labor intensive and greatly decreases the occurrence
of mistakes in clone identification, a ubiquitous problem
associated with large clone set archiving and retrieving.

Present address: Synteni, Inc., 6519 Dumbarton Circle, Fremont, CA

Fig. 1. Overview of systematic method for isolating individual
genes. Sequence information is obtained automatically from sequence
databases. The data are input into primer selection software
specifically designed to target ORFs as designated by database annotations.
The output file containing the primer information is directly read by
a high-throughput oligonucleotide synthesizer, which makes the
oligonucleotides in 96-well plates (AMOS, automated multiplex
oligonucleotide synthesizer). The forward and reverse primers are
synthesized in the same location on separate plates to facilitate the
downstream handling of primers. The amplicons are generated by PCR in
96-well plates as well.

Any large-scale effort to capture each ORF within a genome 
must rely on automation if cost is to be minimized while 
efficiency is maximized. Toward that end, primers targeting 
ORFs were designed automatically using simple new scripts 
and existing primer selection software. These script-selected
primer sequences were directly read by the high-throughput
synthesizer and the forward and reverse primers were synthe- 
sized in separate plates in corresponding wells to facilitate 
automated pipetting and PCR amplifications. Each of the 
resulting PCR products, generated with minimum labor, con- 
tains a known, unique ORF. 
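
The layout described above (forward and reverse primers in corresponding wells of separate plates) can be sketched as follows. This is a minimal illustration, not the authors' scripts: the article does not name its primer selection software, so the fixed-length rule used here (first and last `length` bases of the ORF) is a stand-in assumption, and all function names are hypothetical. Real primer selection would also weigh melting temperature, specificity, and secondary structure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orf_primers(orf_seq, length=20):
    """Naive primer pair: forward = ORF start, reverse = revcomp of ORF end."""
    forward = orf_seq[:length]
    reverse = reverse_complement(orf_seq[-length:])
    return forward, reverse


def well_name(index):
    """Map 0..95 to 96-well coordinates A1..H12, filling row by row."""
    if not 0 <= index < 96:
        raise ValueError("a 96-well plate holds indices 0..95")
    row, col = divmod(index, 12)
    return "ABCDEFGH"[row] + str(col + 1)


def plate_layout(orfs, length=20):
    """Place each ORF's primers in the same well of two separate plates.

    orfs: list of (name, sequence) tuples, at most 96 entries.
    Returns (forward_plate, reverse_plate), each a dict well -> (name, primer).
    """
    forward_plate, reverse_plate = {}, {}
    for i, (name, seq) in enumerate(orfs):
        fwd, rev = orf_primers(seq, length)
        well = well_name(i)
        forward_plate[well] = (name, fwd)
        reverse_plate[well] = (name, rev)
    return forward_plate, reverse_plate
```

Keeping both primers of a pair at the same well coordinate means a multichannel pipettor can combine the two plates well for well without any lookup table.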

Large-scale genome analysis projects are dependent on 
newly emerging technologies to make the studies practical and 
economically feasible. For example, the cost of the primers, a 
significant issue in the past, has been reduced dramatically to 
make feasible this and other projects that require tens of 
thousands of oligonucleotides. Other methods of high- 
throughput analysis are also vital to the success of functional 
analysis projects, such as microarraying and oligonucleotide
chip methods (10-14). 

Changes in attitude are also required. One of the major costs
of commercial oligonucleotides is extensive quality control
such that virtually 100% of the supplied oligonucleotides are
successfully synthesized and work for their intended purpose.
Considerable cost reduction can be obtained by simply
decreasing the expected successful synthesis rate to 95-97%. One
can then achieve faster and cheaper whole genome coverage by
simply adding a single quality control at the end of the
experiment and batching the failures for resynthesis.

Fig. 2. Overall approach for using database of a genome to direct
biological analysis. The synthesis of the 6,000 ORFs (orfs) for each
gene of S. cerevisiae can be used in many applications utilizing both
cloning and microarraying technology.

The directed nature of the amplicon approach is of clear
advantage. The sequence of each ORF is analyzed automatically,
and unique specific primers are made to target each
ORF. Thus, there is relatively little time or labor involved; for
example, no random cloning and subsequent screening is
required because each product is known. In the test system,
primers for 240 ORFs from chromosome V were systematically
synthesized, beginning from the left arm and continuing
through to the right arm. At no point was there any manual
analysis of sequence information to generate the collection. In
many ways, now that the sequence is known, there is no need
for the researcher to examine it.

These amplicons can be arrayed and expression analysis can
be done on all arrayed ORFs with a single hybridization (10).
Those ORFs that display significant differential expression
patterns under a given selection are easily identified without
the laborious task of searching for and then sequencing a clone.
Once scaled up, the procedure provides even greater returns
on effort, because a single hybridization will ultimately provide
a "snapshot" of the expression of all genes in the yeast genome.
Thus, the limiting factor in whole genome analysis will not be
the analysis process itself, but will instead be the ability of
researchers to design and carry out experimental selections.

Current expression and genetic analysis technologies are 
geared toward the analysis of single genes and are ill suited to 
analyze numerous genes under many conditions. Additional 
difficulties with current technologies include: the effort and 
expense required to analyze expression and make mutants, the 
potential duplication of effort if done by different laboratories, 
and the possibility of conflicting results obtained from differ- 
ent laboratories. In contrast, whole genome analysis not only 
is more efficient, it also provides data of much higher quality; 
all genes are assayed and compared in parallel under exactly 
the same conditions. In addition, amplicons have many
applications beyond gene expression. For example, one recent
approach is to incorporate a unique DNA sequence tag,
synthesized as part of each gene-specific primer, during
amplification. The tags, or molecular bar codes, when
reintroduced into the organism as a gene deletion or as a gene clone,
can be used much more efficiently than individual mutations
or clones because pools of tagged mutants or transformants
can be analyzed in parallel. This parallel analysis is possible
because the tags are readily and quantitatively amplified even
in complex mixtures of tags (13).
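
One way to picture the pooled readout is to compare each tag's share of the pool before and after a selection. The sketch below is an assumption-laden illustration rather than the method of reference 13: the tag counts are hypothetical inputs (for example, hybridization signals or read counts), and `tag_fold_changes` is an invented helper name.

```python
def tag_fold_changes(before, after):
    """Compare each tag's relative abundance before and after selection.

    before, after: dict mapping tag name -> measured count or signal.
    Returns dict tag -> (fraction after) / (fraction before).
    Tags absent or zero in the starting pool are skipped.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    if total_before <= 0 or total_after <= 0:
        raise ValueError("both pools must contain signal")
    changes = {}
    for tag, n_before in before.items():
        if n_before == 0:
            continue
        frac_before = n_before / total_before
        frac_after = after.get(tag, 0) / total_after
        changes[tag] = frac_after / frac_before
    return changes


if __name__ == "__main__":
    pool_start = {"tag_A": 100, "tag_B": 100}  # hypothetical counts
    pool_end = {"tag_A": 150, "tag_B": 50}
    print(tag_fold_changes(pool_start, pool_end))
```

A ratio below 1 marks a tagged strain that was depleted by the selection, and a ratio above 1 marks one that was enriched, all from a single pooled measurement.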

These ORF genome arrays and oligonucleotide tagged
libraries can be used for many applications. Any conventional
selection applied to a library that gives discrete or multiple
products can use these technologies for a simple direct readout.
These include screens and selections for mutant complementation,
overexpression suppression (15, 16), second-site
suppressors, synthetic lethality, drug target overexpression
(17), two-hybrid screens (18), genome mismatch scanning (19),
or recombination mapping.

The genome projects have provided researchers with a vast
amount of information. These data must be used efficiently
and systematically to gain a truly comprehensive understanding
of gene function and, more broadly, of the entire genome,
which can then be applied to other organisms. Such global
approaches are essential if we are to gain an understanding of
the living cell. This understanding should come from the
viewpoint of the integration of complex regulatory networks,
the individual roles and interactions of thousands of functional
gene products, and the effect of environmental changes on
both gene regulatory networks and the roles of all gene
products. The time has come to switch from the analysis of a
single gene to the analysis of the whole genome.
1. An important feature of the work of many molecular biologists is identifying which
genes are switched on and off in a cell under different environmental conditions or
subsequent to xenobiotic challenge. Such information has many uses, including the
deciphering of molecular pathways and facilitating the development of new experimental
and diagnostic procedures. However, the student of gene hunting should be forgiven for
perhaps becoming confused by the mountain of information available, as there appear to be
almost as many methods of discovering differentially expressed genes as there are research
groups using the technique.

2. The aim of this review was to clarify the main methods of differential gene expression
analysis and the mechanistic principles underlying them. Also included is a discussion on
some of the practical aspects of using this technique. Emphasis is placed on the so-called
'open' systems, which require no prior knowledge of the genes contained within the study
model. Whilst these will eventually be replaced by 'closed' systems in the study of human,
mouse and other commonly studied laboratory animals, they will remain a powerful tool for
those examining less fashionable models.

3. The use of suppression-PCR subtractive hybridization is exemplified in the
identification of up- and down-regulated genes in rat liver following exposure to
phenobarbital, a well-known inducer of the drug metabolizing enzymes.

4. Differential gene display provides a coherent platform for building libraries and 
microchip arrays of 'gene fingerprints' characteristic of known enzyme inducers and 
xenobiotic toxicants, which may be interrogated subsequently for the identification and 
characterization of xenobiotics of unknown biological properties. 

Introduction 

It is now apparent that the development of almost all cancers and many
non-neoplastic diseases is accompanied by altered gene expression in the affected cells
compared to their normal state (Hunter 1991, Wynford-Thomas 1991, Vogelstein
and Kinzler 1993, Semenza 1994, Cassidy 1995, Kleinjan and Van Heyningen 1998).
Such changes also occur in response to external stimuli such as pathogenic
micro-organisms (Rohn et al. 1996, Singh et al. 1997, Griffin and Krishna 1998, Lunney
1998) and xenobiotics (Sewall et al. 1995, Dogra et al. 1998, Ramana and Kohli
1998), as well as during the development of undifferentiated cells (Hecht 1998,
Rudin and Thompson 1998, Schneider-Maunoury et al. 1998). The potential
medical and therapeutic benefits of understanding the molecular changes which
occur in any given cell in progressing from the normal to the 'altered' state are
enormous. Such profiling essentially provides a 'fingerprint' of each step of a
cell's development or response and should help in the elucidation of specific and
sensitive biomarkers representing, for example, different types of cancer or previous
exposure to certain classes of chemicals that are enzyme inducers.

In drug metabolism, many of the xenobiotic-metabolizing enzymes (including 
the well-characterized isoforms of cytochrome P450) are inducible by drugs and 
chemicals in man (Pelkonen et al. 1998), predominantly involving transcriptional
activation of not only the cognate cytochrome P450 genes, but additional cellular 
proteins which may be crucial to the phenomenon of induction. Accordingly, the
development of methodology to identify and assess the full complement of genes
that are either up- or down-regulated by inducers is crucial in the development of
knowledge to understand the precise molecular mechanisms of enzyme induction 
and how this relates to drug action. Similarly, in the field of chemical-induced 
toxicity, it is now becoming increasingly obvious that most adverse reactions to 
drugs and chemicals are the result of multiple gene regulation, some of which are
causal and some of which are casually related to the toxicological phenomenon per
se. This observation has led to an upsurge in interest in gene-profiling technologies 
which differentiate between the control and toxin-treated gene pools in target tissues 
and is, therefore, of value in rationalizing the molecular mechanisms of xenobiotic- 
induced toxicity. Knowledge of toxin-dependent gene regulation in target tissues is 
not solely an academic pursuit as much interest has been generated in the 
pharmaceutical industry to harness this technology in the early identification of toxic 
drug candidates, thereby shortening the developmental process and contributing 
substantially to the safety assessment of new drugs. For example, if the gene profile 
in response to say a testicular toxin that has been well-characterized in vivo could be 
determined in the testis, then this profile would be representative of all new drug 
candidates which act via this specific molecular mechanism of toxicity, thereby 
providing a useful and coherent approach to the early detection of such toxicants. 
Whereas it would be informative to know the identity and functionality of all genes 
up/down regulated by such toxicants, this would appear a longer term goal, as the 
majority of human genes have not yet been sequenced, far less their functionality 
determined. However, the current use of gene profiling yields a pattern of gene 
changes for a xenobiotic of unknown toxicity which may be matched to that of well- 
characterized toxins, thus alerting the toxicologist to possible in vivo similarities 
between the unknown and the standard, thereby providing a platform for more 
extensive toxicological examination. Such approaches are beginning to gain
momentum, in that several biotechnology companies are commercially producing 
'gene chips' or 'gene arrays' that may be interrogated for toxicity assessment of 
xenobiotics. These chips consist of hundreds/thousands of genes, some of which are 
degenerate in the sense that not all of the genes are mechanistically-related to any 
one toxicological phenomenon. Whereas these chips are useful in broad-spectrum 
screening, they are maturing at a substantial rate, in that gene arrays are now 
becoming more specific, e.g. chips for the identification of changes in growth factor 
families that contribute to the aetiology and development of chemically-induced 
neoplasias. 

Although documenting and explaining these genetic changes presents a 
formidable obstacle to understanding the different mechanisms of development and 
disease progression, the technology is now available to begin attempting this difficult 
challenge. Indeed, several 'differential expression analysis' methods have been 
developed which facilitate the identification of gene products that demonstrate 
altered expression in cells of one population compared to another. These methods
have been used to identify differential gene expression in many situations, including
invading pathogenic microbes (Zhao et al. 1998), in cells responding to extracellular
and intracellular microbial invasion (Duguid and Dinauer 1990, Ragno et al. 1997,
Maldarelli et al. 1998), in chemically treated cells (Syed et al. 1997, Rockett et al.
1999), neoplastic cells (Liang et al. 1992, Chang and Terzaghi-Howe 1998),
activated cells (Gurskaya et al. 1996, Wan et al. 1996), differentiated cells (Hara et
al. 1991, Guimaraes et al. 1995a, b), and different cell types (Davis et al. 1984,
Hedrick et al. 1984, Xhu et al. 1998). Although differential expression analysis
technologies are applicable to a broad range of models, perhaps their most important 
advantage is that, in most cases, absolutely no prior knowledge of the specific genes 
which are up- or down-regulated is required. 

The field of differential expression analysis is a large and complex one, with 
many techniques available to the potential user. These can be categorized into 
several methodological approaches, including: 

(1) Differential screening, 

(2) Subtractive hybridization (SH), including methods such as chemical
cross-linking subtraction (CCLS), suppression-PCR subtractive hybridization
(SSH), and representational difference analysis (RDA),

(3) Differential display (DD), 

(4) Restriction endonuclease facilitated analysis (including serial analysis of gene 
expression— SAGE — and gene expression fingerprinting— GEF), 

(5) Gene expression arrays, and 

(6) Expressed sequence tag (EST) analysis. 

The above approaches have been used successfully to isolate differentially 
expressed genes in different model systems. However, each method has its own 
subtle (and sometimes not so subtle) characteristics which incur various advantages 
and disadvantages. Accordingly, it is the purpose of this review to clarify the 
mechanistic principles underlying the main differential expression methods and to 
highlight some of the broader considerations and implications of this very powerful 
and increasingly popular technique. Specifically, we will concentrate on the so- 
called 'open' systems, namely those which do not require any knowledge of gene 
sequences and, therefore, are useful for isolating unknown genes. Two 'closed' 
systems (those utilising previously identified gene sequences), EST analysis and the 
use of DNA arrays, will also be considered briefly for completeness. Whilst 
emphasis will often be placed on suppression PCR subtractive hybridization (SSH,
the approach employed in this laboratory), it is the aim of the authors to highlight,
wherever possible, those areas of common interest to those who use, or intend to use,
differential gene expression analysis.

Differential cDNA library screening (DS) 

Despite the development of multiple technological advances which have recently
brought the field of gene expression profiling to the forefront of molecular analysis,
recognition of the importance of differential gene expression and characterization of
differentially expressed genes has existed for many years. One of the original
approaches used to identify such genes was described 20 years ago by St John and
Davis (1979). These authors developed a method, termed 'differential plaque filter
hybridization', which was used to isolate galactose-inducible DNA sequences from
yeast. The theory is simple: a genomic DNA library is prepared from normal,
unstimulated cells of the test organism/tissue and multiple filter replicas are
prepared. These replica blots are probed with radioactively (or otherwise) labelled
complex cDNA probes prepared from the control and test cell mRNA populations.
Those mRNAs which are differentially expressed in the treated cell population will
show a positive signal only on the filter probed with cDNA from the treated cells.
Furthermore, labelled cDNA from different test conditions can be used to probe
multiple blots, thereby enabling the identification of mRNAs which are only
up-regulated under certain conditions. For example, St John and Davis (1979) screened
replica filters with acetate-, glucose- and galactose-derived probes in order to obtain
genes induced specifically by galactose metabolism. Although groundbreaking in its
time, this method is now considered insensitive and time-consuming, as up to 2
months are required to complete the identification of genes which are differentially
expressed in the test population. In addition, there is no convenient way to check
that the procedure has worked until the whole process has been completed.
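
The selection logic of replica-filter screening can be sketched as a comparison of signals across filters. The example below is an illustration only: the clone names, signal values, and threshold are hypothetical, and a real screen scores autoradiograph spots rather than clean numbers.

```python
def condition_specific(signals, target, threshold=1.0):
    """Return clones that are positive only on the filter for `target`.

    signals: dict clone -> dict condition -> signal intensity
             (one entry per replica filter).
    A clone qualifies if its signal reaches `threshold` on the target
    filter and stays below `threshold` on every other filter.
    """
    hits = []
    for clone, by_condition in signals.items():
        on_target = by_condition.get(target, 0.0) >= threshold
        off_elsewhere = all(
            value < threshold
            for condition, value in by_condition.items()
            if condition != target
        )
        if on_target and off_elsewhere:
            hits.append(clone)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical intensities for four clones on three replica filters.
    signals = {
        "clone_1": {"acetate": 0.1, "glucose": 0.2, "galactose": 5.0},
        "clone_2": {"acetate": 3.0, "glucose": 3.1, "galactose": 2.9},
        "clone_3": {"acetate": 0.0, "glucose": 0.1, "galactose": 0.2},
        "clone_4": {"acetate": 2.0, "glucose": 0.1, "galactose": 4.0},
    }
    print(condition_specific(signals, "galactose"))
```

With these made-up values only `clone_1` is reported: `clone_2` is expressed under all conditions, `clone_3` under none, and `clone_4` is also positive on the acetate filter.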

Subtractive Hybridization (SH) 

The developing concept of differential gene expression and the success of early 
approaches such as that described by St John and Davis (1979) soon gave rise to a 
search for more convenient methods of analysis. One of the first to be developed was 
SH, numerous variations of which have since been reported (see below). In general, 
this approach involves hybridization of mRNA/cDNA from one population (tester)
to excess mRNA/cDNA from another (driver), followed by separation of the 
unhybridized tester fraction (differentially expressed) from the hybridized common 
sequences. This step has been achieved physically, chemically and through the use 
of selective polymerase chain reaction (PCR) techniques. 
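
In its most idealized form, the tester/driver logic amounts to a set difference. The sketch below assumes all-or-none behaviour (any tester sequence with a counterpart in the driver is removed completely), which real hybridizations only approximate; the transcript names are hypothetical. Swapping the roles of the two populations gives the reverse subtraction.

```python
def subtract(tester, driver):
    """Idealized subtraction: keep tester sequences absent from the driver."""
    return sorted(set(tester) - set(driver))


if __name__ == "__main__":
    # Hypothetical transcript populations.
    control = {"actin", "albumin", "gene_a"}
    treated = {"actin", "albumin", "gene_b"}

    # Forward subtraction: treated cells as tester, control as driver.
    print(subtract(treated, control))  # species found only in treated cells
    # Reverse subtraction: control as tester, treated as driver.
    print(subtract(control, treated))  # species lost on treatment
```

This presence/absence model ignores quantitative changes in abundance, which is one reason practical protocols use a large driver excess and repeated rounds of hybridization.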

Physical separation 

Original subtractive hybridization technology involved the physical separation 
of hybridized common species from unique single stranded species. Several methods 
of achieving this have been described, including hydroxyapatite chromatography 
(Sargent and Dawid 1983), avidin-biotin technology (Duguid and Dinauer 1990) 
and oligodT-latex separation (Hara et al. 1991). In the first approach, common 
mRNA species are removed by cDNA (from test cells)-mRNA (from control cells) 
subtractive hybridization followed by hydroxyapatite chromatography, as hydroxy- 
apatite specifically adsorbs the cDNA-mRNA hybrids. The unabsorbed cDNA is 
then used either for the construction of a cDNA library of differentially expressed 
genes (Sargent and Dawid 1983, Schneider et al. 1988) or directly as a probe to 
screen a preselected library (Zimmerman et al. 1980, Davis et al. 1984, Hedrick et al. 
1984). A schematic diagram of the procedure is shown in figure 1. 

Figure 1. The hydroxyapatite method of subtractive hybridization. cDNA derived from the
treated/altered (tester) population is mixed with a large excess of mRNA from the control (driver)
population. Following hybridization, mRNA-cDNA hybrids are removed by hydroxyapatite
chromatography. The only cDNAs which remain are those which are differentially expressed in
the treated/altered population. In order to facilitate the recovery of full length clones, small cDNA
fragments are removed by exclusion chromatography. The remaining cDNAs are then cloned into
a vector for sequencing, or labelled and used directly to probe a library, as described by Sargent
and Dawid (1983).

Less rigorous physical separation procedures coupled with sensitivity enhancing
PCR steps were later developed as a means to overcome some of the problems
encountered with the hydroxyapatite procedure. For example, Duguid and Dinauer
(1990) described a method of subtraction utilizing biotin-affinity systems as a means
to remove hybridized common sequences. In this process, both the control and
tester mRNA populations are first converted to cDNA and an adaptor ('oligovector',
containing a restriction site) ligated to both sides. Both populations are then
amplified by PCR, but the driver cDNA population is subsequently digested with
the adaptor-containing restriction endonuclease. This serves to cleave the
oligovector and reduce the amplification potential of the control population. The digested
control population is then biotinylated and an excess mixed with tester cDNA.
Following denaturation and hybridization, the mix is applied to a biocytin column
(streptavidin may also be used) to remove the control population, including
heteroduplexes formed by annealing of common sequences from the tester
population. The procedure is repeated several times following the addition of fresh
control cDNA. In order to further enrich those species differentially expressed in
the tester cDNA, the subtracted tester population is amplified by PCR following
every second subtraction cycle. After six cycles of subtraction (three reamplification
steps) the reaction mix is ligated into a vector for further analysis.

Figure 2. The use of oligodT30 latex to perform subtractive hybridization. mRNA extracted from the
control (driver) population is converted to anchored cDNA using polydT oligonucleotides
attached to latex beads. mRNA from the treated/altered (tester) population is repeatedly
hybridized against an excess of the anchored driver cDNA. The final population of mRNA is
tester specific and can be converted into cDNA for cloning and other downstream applications, as
described by Hara et al. (1991).

In a slightly different approach, Hara et al. (1991) utilized a method whereby
oligo(dT30) primers attached to a latex substrate are used to first capture mRNA
extracted from the control population. Following 1st strand cDNA synthesis, the
RNA strand of the heteroduplexes is removed by heat denaturation and
centrifugation (the cDNA-oligotex-dT30 forms a pellet and the supernatant is removed).
A quantity of tester mRNA is then repeatedly hybridized to the immobilized control
(driver) cDNA (which is present in 20-fold excess). After several rounds of
hybridization the only mRNA molecules left in the tester mRNA population are
those which are not found in the driver cDNA-oligotex-dT30 population. These
tester-specific mRNA species are then converted to cDNA and, following the
addition of adaptor sequences, amplified by PCR. The PCR products are then
ligated into a vector for further analysis using restriction sites incorporated into the
PCR primers. A schematic illustration of this subtraction process is shown in figure 2.

However, all these methods utilising physical separation have been described as 
inefficient due to the requirement for large starting amounts of mRNA, significant 
loss of material during the separation process and a need for several rounds of 
hybridization. Hence, new methods of differential expression analysis have recently 
been designed to eliminate these problems. 



Chemical Cross-Linking Subtraction (CCLS)

In this technique, originally described by Hampson et al. (1992), driver mRNA 
is mixed with tester cDNA (1st strand only) in a ratio of > 20:1. The common 
sequences form cDNA:mRNA hybrids, leaving the tester specific species as single 
stranded cDNA. Instead of physically separating these hybrids, they are inactivated 
chemically using 2,5-diaziridinyl-1,4-benzoquinone (DZQ). Labelled probes are
then synthesized from the remaining single stranded cDNA species (unreacted 
mRNA species remaining from the driver are not converted into probe material due 
to specificity of Sequenase T7 DNA polymerase used to make the probe) and used 
to screen a cDNA library made from the tester cell population. A schematic diagram 
of the system is shown in figure 3. 

It has been shown that the differentially expressed sequences can be enriched at 
least 300-fold with one round of subtraction (Hampson et al. 1992), and that the 
technique should allow isolation of cDN As derived from transcripts that are present 
at less than 50 copies per cell. This equates to genes at the low end of intermediate 
abundance (see table 1). The main advantages of the CCLS approach are that it is 
rapid, technically simple and also produces fewer false positives than other 
differential expression analysis methods. However, like the physical separation 
protocols, a major drawback with CCLS is the large amount of starting material 
required (at least 10 pig RNA). Consequently, the technique has recently been 
refined so that a renewable source of RNA can be generated. The degenerate random 
oligonucleotide primed (DROP) adaptation (Hampson et al. 1996, Hampson and 
Hampson 1997) uses random hexanucleotide sequences to prime solid phase-synthesized
cDNA. Since each primer includes a T7 polymerase promoter sequence

Figure 3. Chemical cross-linking subtraction. Excess driver mRNA is mixed with 1st strand tester
cDNA. The common sequences form mRNA:cDNA hybrids which are cross-linked with
2,5-diaziridinyl-1,4-benzoquinone (DZQ) and the remaining cDNA sequences are differentially
expressed in the tester population. Probes are made from these sequences using Sequenase 2.0
DNA polymerase, which lacks reverse transcriptase activity and, therefore, does not react with the
remaining mRNA molecules from the driver. The labelled probes are then used to screen a cDNA
library for clones of differentially expressed sequences. Adapted from Walter et al. (1996), with
permission.



Table 1. The abundance of mRNA species and classes in a typical mammalian cell.

mRNA class     Copies of each   No. of mRNA        Mean % of each     Mean mass (ng) of each
               species/cell     species in class   species in class   species/µg total RNA
Abundant       12000            4                  3.3                1.65
Intermediate   300              500                0.08               0.04
Rare           15               11000              0.004              0.002

Modified from Bertioli et al. (1995).

at the 5' end, the final pool of random cDNA fragments is a PCR-renewable cDNA
population which is representative of the expressed gene pool and can be used to 
synthesize sense RNA for use as driver material. Furthermore, if the final pool of 
random cDNA fragments is reamplified using biotinylated T7 primer and random 
hexamer, the product can be captured with streptavidin beads and the antisense 
strand eluted for use as tester. Since both target and driver can be generated from 
the same DROP product, subtraction can be performed in both directions (i.e. for 
up- and down-regulated species) between two different DROP products. 
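
The last two columns of table 1 can be recovered from the first two, which helps when judging how rare a transcript a given method must detect. The following is a minimal sketch rather than anything taken from the cited papers: the function name is hypothetical, and the figure of 50 ng mRNA per µg total RNA (mRNA as 5% of total RNA by mass) is an assumption chosen because it reproduces the published mass column.

```python
# Copies per cell and number of species per class, taken from table 1.
CLASSES = {
    "Abundant": (12000, 4),
    "Intermediate": (300, 500),
    "Rare": (15, 11000),
}

# Assumption (not stated in the text): mRNA makes up 5% of total RNA by mass,
# i.e. 50 ng of mRNA per microgram of total RNA.
MRNA_NG_PER_UG_TOTAL_RNA = 50.0


def per_species_share(classes):
    """Return {class: (percent of all mRNA, ng per ug total RNA)} for one species."""
    total_copies = sum(copies * n_species for copies, n_species in classes.values())
    shares = {}
    for name, (copies, _n_species) in classes.items():
        fraction = copies / total_copies
        shares[name] = (100 * fraction, fraction * MRNA_NG_PER_UG_TOTAL_RNA)
    return shares


if __name__ == "__main__":
    for name, (percent, mass_ng) in per_species_share(CLASSES).items():
        print(f"{name:12s} {percent:8.4f} %  {mass_ng:8.4f} ng")
```

Under these assumptions a single rare transcript accounts for roughly 0.004% of the mRNA pool, which is why enrichment steps matter for the low-abundance classes.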

Representational Difference Analysis (RDA) 

RDA of cDNA (Hubank and Schatz 1994) is an extension of the technique 
originally applied to genomic DNA as a means of identifying differences between 
two complex genomes (Lisitsyn et al. 1993). It is a process of subtraction and 
amplification involving subtractive hybridization of the tester in the presence of 
excess driver. Sequences in the tester that have homologues in the driver are 
rendered unamplifiable, whereas those genes expressed only in the tester retain the 
ability to be amplified by PCR. The procedure is shown schematically in figure 4. 

In essence, the driver and tester mRNA populations are first converted to cDNA 
and amplified by PCR following the ligation of an adaptor. The adaptors are then 
removed from both populations and a new (different) adaptor ligated to the 
amplified tester population only. Driver and tester populations are next melted and 
hybridized together in a ratio of 100:1. Following hybridization, only tester:tester
homohybrids have 5' adaptors at each end of the DNA duplex and can, thus, be filled
in at both 3' ends. Hence, only these molecules are amplified exponentially during
the subsequent PCR step. Although tester:driver heterohybrids are present, they
only amplify in a linear fashion, since the strand derived from the driver has no
adaptor to which the primer can bind. Driver:driver homohybrids have no
adaptors and, therefore, are not amplified. Single stranded molecules are digested
with mung bean nuclease before a further PCR-enrichment of the tester:tester
homohybrids. The adaptors on the amplified tester population are then replaced and
the whole process repeated a further two or three times using an increasing excess of
driver (Hubank and Schatz used a tester:driver ratio of 1:400, 1:80000 and
1:800000 for the second, third and fourth hybridizations, respectively). Different
adaptors are ligated to the tester between successive rounds of hybridization and 
amplification to prevent the accumulation of PCR products that might interfere with 
subsequent amplifications. The final display is a series of differentially expressed 
gene products easily observable on an ethidium bromide gel. 
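
The difference between exponential and linear amplification of the hybrid classes can be made concrete with an idealized calculation. The sketch below is illustrative only: it assumes perfect PCR efficiency, and the function name and the simple doubling and one-copy-per-cycle models are assumptions introduced here, not part of the published protocol.

```python
def copies_after_pcr(hybrid: str, cycles: int, start: int = 1) -> int:
    """Idealized copy number of one hybrid class after a number of PCR cycles."""
    if cycles < 0 or start < 0:
        raise ValueError("cycles and start must be non-negative")
    if hybrid == "tester:tester":
        # Adaptors on both strands: both are primed, so copies double each cycle.
        return start * 2 ** cycles
    if hybrid == "tester:driver":
        # Only one strand can be primed: one new copy per template per cycle.
        return start * (1 + cycles)
    if hybrid == "driver:driver":
        # No adaptor, hence no primer binding site and no amplification.
        return start
    raise ValueError(f"unknown hybrid class: {hybrid}")


if __name__ == "__main__":
    for hybrid in ("tester:tester", "tester:driver", "driver:driver"):
        print(hybrid, copies_after_pcr(hybrid, cycles=20))
```

In this idealized model, 20 cycles separate the tester:tester class from the tester:driver class by a factor of about 50000, which is the basis of the enrichment even before nuclease digestion.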

The main advantages of RDA are that it offers a reproducible and sensitive 
approach to the analysis of differentially expressed genes. Hubank and Schatz (1994)
reported that they were able to isolate genes that were differentially expressed in 
substantially less than 1 % of the cells from which the tester is derived. Perhaps the 
main drawback is that multiple rounds of ligation, hybridization, amplification and
digestion are required. The procedure is, therefore, lengthier than many other 
differential display approaches and provides more opportunity for operator-induced 
error to occur. Although the generation of false positives has been noted, this has 
been solved to some degree by O'Neill and Sinclair (1997) through the use of HPLC- 
purified adaptors. These are free of the truncated adaptors which appear to be a 
major source of the false positive bands. A very similar technique to RDA, termed 
linker capture subtraction (LCS), was described by Yang and Sytowski (1996).

Figure 4. The representational difference analysis (RDA) technique. Driver and tester cDNA are
digested with a 4-cutter restriction enzyme such as DpnII. The 1st set of 12/24 adaptor strands
(oligonucleotides) are ligated to each other and the digested cDNA products. The 12mer is
subsequently melted away and the 3' ends filled in using Taq DNA polymerase. Each cDNA
population is then amplified using PCR, following which the 1st set of adaptors is removed with
DpnII. A second set of 12/24 adaptor strands is then added to the amplified tester cDNA
population, after which the tester is hybridized against a large excess of driver. The 12mer
adaptors are melted and the 3' ends filled in as before. PCR is carried out with primers identical
to the new 24mer adaptor. Thus, the only hybridization products which are exponentially
amplified are those which are tester:tester combinations. Following PCR, ssDNA products are
removed with mung bean nuclease, leaving the 'first difference product'. This is digested and a
third set of 12/24 adaptors added before repeating the subtraction process from the hybridization
stage. The process is repeated to the 3rd or 4th difference product, as described by Lisitsyn et al.
(1993) and Hubank and Schatz (1994).

Suppression PCR Subtractive Hybridization (SSH)

The most recent adaptation of the SH approach to differential expression
analysis was first described by Diatchenko et al. (1996) and Gurskaya et al. (1996).
They reported that a 1000-5000 fold enrichment of rare cDNAs (equivalent to
isolating mRNAs present at only a few copies per cell) can be obtained without the
need for multiple hybridizations/subtractions. Instead of physical or chemical
removal of the common sequences, a PCR-based suppression system is used (see
figure 5).

In SSH, excess driver cDNA is added to two portions of the tester cDNA which
have been ligated with different adaptors. A first round of hybridization serves to
enrich differentially expressed genes and equalize rare and abundant messages.
Equalization occurs since reannealing is more rapid for abundant molecules than for
rare molecules due to the second order kinetics of hybridization (James and Higgins
1985). The two primary hybridization mixes are then mixed together in the presence
of excess driver and allowed to hybridize further. This step permits the annealing of
single stranded complementary sequences which did not hybridize in the primary
hybridization, and in doing so generates templates for PCR amplification. Although
there are several possible combinations of the single stranded molecules present in
the secondary hybridization mix, only one particular combination (differentially
expressed in the tester cDNA composed of complementary strands having different
adaptors) can amplify exponentially.
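
The equalization effect follows directly from second order kinetics. A minimal sketch, assuming an idealized reaction in which each species reanneals only with its own complement (dC/dt = -k·C², so C(t) = C0/(1 + k·C0·t)): the function name, the rate constant and the time units are illustrative assumptions, and the starting abundances are simply the abundant and rare copy numbers from table 1.

```python
def single_stranded(c0: float, k: float, t: float) -> float:
    """Single stranded concentration left after time t when dC/dt = -k * C**2."""
    return c0 / (1.0 + k * c0 * t)


if __name__ == "__main__":
    k = 0.001  # arbitrary rate constant, chosen for illustration only
    abundant, rare = 12000.0, 15.0  # relative starting abundances (table 1)
    for t in (0, 1, 10, 100, 1000):
        a = single_stranded(abundant, k, t)
        r = single_stranded(rare, k, t)
        print(f"t={t:5d}  abundant={a:10.3f}  rare={r:8.3f}  ratio={a / r:8.2f}")
```

As t grows, the single stranded concentration of every species tends towards 1/(k·t) regardless of its starting level, so the 800-fold difference between the two species shrinks towards unity. Real hybridizations also involve driver strands and incomplete reactions, which is consistent with the imperfect equalization noted in practice.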

Having obtained the final differential display, two options are available if cloning
of cDNAs is desired. One is to transform the whole of the final PCR reaction into
competent cells. Transformed colonies can then be isolated and their inserts
characterized by sequencing, restriction analysis or PCR. Alternatively, the final
PCR products can be resolved on a gel and the individual bands excised, reamplified
and cloned. The first approach is technically simpler and less time consuming.
However, ligation/transformation reactions are known to be biased towards the
cloning of smaller molecules, and so the final population of clones will probably not
contain a representative selection of the larger products. In addition, although
equalization theoretically occurs, observations in this laboratory suggest that this is
by no means perfectly accomplished. Consequently, some gene species are present
in a higher number than others and this will be represented in the final population
of clones. Thus, in order to obtain a substantial proportion of those gene species that
actually demonstrate differential expression in the tester population, the number of
clones that will have to be screened after this step may be substantial. The second
approach is initially more time consuming and technically demanding. However, it
would appear to offer better prospects for cloning larger and low abundance gel
products. In addition, one can incorporate a screening step that differentiates
products of different sequence but of the same size (HA-staining, see
later). In this way, a good idea of the final number of clones to be isolated and
identified can be achieved.

An alternative (or even complementary) approach is to use the final differential 
display reaction to screen a cDNA library to isolate full length clones for further
characterization, or a DNA array (see later) to quickly identify known genes. SSH
has been used in this laboratory to begin characterization of the short-term gene
expression profiles of enzyme-inducers such as phenobarbital (Rockett et al. 1997)
and Wy-14,643 (Rockett et al. unpublished observations). The isolation of 
differentially expressed genes in this manner enables the construction of a fingerprint

[Figure 5 schematic: tester cDNA with adaptor 1 and tester cDNA with adaptor 2 are each hybridized
with driver cDNA; the samples are mixed, fresh denatured driver is added and the mixture annealed,
giving molecule types a, b, c, d and e; the ends are filled in, primers added and the products
amplified by PCR. Types a and d: no amplification; type b: no amplification, suppressed due to
formation of a panhandle structure; type c: linear amplification; type e: exponential amplification.]

Figure 5. PCR-select cDNA subtraction. In the primary hybridization, an excess of driver cDNA is
added to each tester cDNA population. The samples are heat denatured and allowed to hybridize
for between 3 and 8 h. This serves two purposes: (1) to equalize rare and abundant molecules; and
(2) to enrich for differentially expressed sequences; cDNAs that are not differentially expressed
form type c molecules with the driver. In the secondary hybridization, the two primary
hybridizations are mixed together without denaturing. Fresh denatured driver can also be added
at this point to allow further enrichment of differentially expressed sequences. Type e molecules
are formed in this secondary hybridization which are subsequently amplified using two rounds of
PCR. The final products can be visualized on an agarose gel, labelled directly or cloned into a
vector for downstream manipulation. As described by Diatchenko et al. (1996) and Gurskaya
et al. (1996), with permission.

Figure 6. Flow diagram showing method used in this laboratory to isolate and identify clones of genes
which are differentially expressed in rat liver following short term exposure to the enzyme
inducers, phenobarbital and Wy-14,643.

of expressed genes which are unique to each compound and time/dose point. Such
information could be useful in short-term characterization of the toxic potential of
new compounds by comparing the gene-expression profiles they elicit with those
produced by known inducers. Figure 6 shows a flow diagram of the method used to
isolate, verify and clone differentially expressed genes, and figure 7 shows expression 
profiles obtained from a typical SSH experiment. Subsequent sub-cloning of the 
individual bands, sequencing and gene data base interrogation reveals many genes 
which are either up- or down-regulated by phenobarbital in the rat (tables 2 and 3). 

One of the advantages in using the SSH approach is that no prior knowledge is 
required of which specific genes are up/down-regulated subsequent to xenobiotic

Figure 7. SSH display patterns obtained from rat liver following 3-day treatment with Wy-14,643 or
phenobarbital. mRNA extracted from control and treated livers was used to generate the
differential displays using the PCR-Select cDNA subtraction kit (Clontech). Lanes: 1, 1 kb
ladder; 2, genes upregulated following Wy-14,643 treatment; 3, genes downregulated following
Wy-14,643 treatment; 4, genes upregulated following phenobarbital treatment; 5, genes
downregulated following phenobarbital treatment; 6, 1 kb ladder. Reproduced from Rockett et
al. (1997), with permission.

exposure, and an almost complete complement of genes is obtained. For example,
the peroxisome proliferator and non-genotoxic hepatocarcinogen Wy-14,643 up-regulates
at least 28 genes and down-regulates at least 15 in the rat (a sensitive
species) and produces 48 up- and 37 down-regulated genes in the guinea pig, a 
resistant species (Rockett, Swales, Esda and Gibson, unpublished observations). 
One of these genes, CD81, was up-regulated in the rat and down-regulated in the 
guinea pig following Wy-14,643 treatment. CD81 (alternatively named TAPA-1) is 
a widely expressed cell surface protein which is involved in a large number of cellular 
processes including adhesion, activation, proliferation and differentiation (Levy et 
al. 1998). Since all of these functions are altered to some extent in the phenomena 
of hepatomegaly and non-genotoxic hepatocarcinogenesis, it is intriguing, and 
probably mechanistically-relevant, that CD81 expression is differentially regulated 
in a resistant and susceptible species. However, the down-side of this approach is 
that the majority of genes can be sequenced and matched to database sequences, but 
the latter are predominantly expressed sequence tags or genes of completely 
unknown function, thus partially obscuring a realistic overall assessment of the 
critical genes of genuine biological interest. Notwithstanding the lack of complete
functional identification of altered gene expression, such gene profiling studies
essentially provide a 'molecular fingerprint' in response to xenobiotic challenge,
thereby serving as a mechanistically-relevant platform for further detailed 
investigations. 

Differential Display (DD) 

Originally described as 'RNA fingerprinting by arbitrarily primed PCR' (Liang 
and Pardee 1992), this method is now more commonly referred to as 'differential

Table 2. Genes up-regulated in rat liver following 3-day exposure to phenobarbital.

Band number            Highest sequence   FASTA-EMBL gene identification
(approximate           similarity
size in bp)
5 (1300)               93.5%              CYP2B1
7 (1000)               95.1%              Preproalbumin; Serum albumin mRNA
8 (950)                98.3%              NCI-CGAP-Pr1 H. sapiens (EST)
10 (850)               95.7%              CYP2B1
11 (800)    Clone 1    94.9%              CYP2B1
            Clone 2    75.3%              CYP2B2
12 (750)               93.8%              TRPM-2 mRNA; Sulfated glycoprotein
15 (600)               92.9%              Preproalbumin; Serum albumin mRNA
16 (55)     Clone 1    95.2%              CYP2B1
            Clone 2    93.6%              Haptoglobulin mRNA partial alpha
21 (350)               99.3%              18S, 5.8S & 28S rRNA

Bands 1-4, 6, 9, 13, 14 and 17-20 were shown to be false positives by dot blot analysis and, therefore,
were not sequenced. Derived from Rockett et al. (1997). It should be noted that the above genes do not
represent the complete spectrum of genes which are up-regulated in rat liver by phenobarbital but
simply represent the genes sequenced and identified to date.



Table 3. Genes down-regulated in rat liver following 3-day exposure to phenobarbital.

Band number (approximate size in bp): 1 (1500); 2 (1200); 3 (1000); 7 (700); 8 (650); 9 (600);
10 (550); 11 (525); 12 (375); 13 (23); 14 (170); 15 (140); others: (300), (275).

Clones: Clone 1, Clone 2, Clone 3; Clone 1, Clone 2; Clone 1, Clone 2; Clone 1, Clone 2, Clone 3.

Highest sequence   FASTA-EMBL gene identification
similarity
95.3%              3-oxoacyl-CoA thiolase
92.3%              Hemopexin mRNA
91.7%              Alpha-2u-globulin mRNA
77.2%              M. musculus C1 inhibitor
94.5%              Electron transfer flavoprotein
91.0%              M. musculus Topoisomerase 1 (Topo 1)
86.9%              Soares 2NbMT M. musculus (EST)
96.2%              Alpha-2u-globulin (s-type) mRNA
86.9%              Soares mouse NML M. musculus (EST)
82.0%              Soares p3NMF19.5 M. musculus (EST)
73.8%              Soares mouse NML M. musculus (EST)
95.7%              NCI-CGAP-Pr1 H. sapiens (EST)
100.0%             Ribosomal protein
97.2%              Soares mouse embryo NbME135 (EST)
100.0%             Fibrinogen B-beta-chain
100.0%             Apolipoprotein E gene
96.0%              Soares p3NMF19.5 M. musculus (EST)
97.3%              Stratagene mouse testis (EST)
96.7%              R. norvegicus RASP 1 mRNA
93.1%              Soares mouse mammary gland (EST)

EST = Expressed sequence tag. Bands 4-6 were shown to be false positives by dot blot analysis and,
therefore, were not sequenced. Derived from Rockett et al. (1997). It should be noted that the above genes
do not represent the complete spectrum of genes which are down-regulated in rat liver by phenobarbital
but simply represent the genes sequenced and identified to date.



display' (DD). In this method, all the mRNA species in the control and treated cell
populations are amplified in separate reactions using reverse transcriptase-PCR
(RT-PCR). The products are then run side-by-side on sequencing gels. Those
bands which are present in one display only, or which are much more intense in one
display compared to the other, are differentially expressed and may be recovered for
further characterization. One advantage of this system is the speed with which it can
be carried out: 2 days to obtain a display and as little as a week to make and identify
clones.

Two commonly used variations are based on different methods of priming the 
reverse transcription step (figure 8). One is to use an oligo dT with a 2-base 'anchor'
at the 3'-end, e.g. 5'-(dT11)CA-3' (Liang and Pardee 1992). Alternatively, an
arbitrary primer may be used for 1st strand cDNA synthesis (Welsh et al. 1992).
This variant of RNA fingerprinting has also been called 'RAP' (RNA Arbitrarily
Primed)-PCR. One advantage of this second approach is that PCR products may be
derived from anywhere in the RNA, including open reading frames. In addition, it
can be used for mRNAs that are not polyadenylated, such as many bacterial mRNAs
(Wong and McClelland 1994). In both cases, following reverse transcription and
denaturation, second strand cDNA synthesis is carried out with an arbitrary primer
(arbitrary primers have a single base at each position, as compared to random
primers, which contain a mixture of all four bases at each position). The resulting
PCR, thus, produces a series of products which, depending on the system (primer
length and composition, polymerase and gel system), usually includes 50-100
products per primer set (Band and Sager 1989). When a combination of different
dT-anchors and arbitrary primers is used, almost all mRNA species from a cell can
be amplified. When the cDNA products from two different populations are analysed 
side by side on a polyacrylamide gel, differences in expression can be identified and 
the appropriate bands recovered for cloning and further analysis. 
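
The way a 2-base anchor selects a subset of polyadenylated mRNAs can be shown with a short example. This is a sketch under stated assumptions: the function name and the example sequence are hypothetical, the primer is modelled as a run of 11 T residues plus two anchor bases, and priming is treated as exact base pairing at the junction between the transcript body and the poly(A) tail.

```python
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def anchored_primer(mrna: str, t_run: int = 11) -> str:
    """Return the oligo(dT) primer (5'->3') whose 2-base anchor matches this mRNA.

    The mRNA is given 5'->3' as RNA and must end in a poly(A) tail.
    """
    rna = mrna.upper()
    body = rna.rstrip("A")  # sequence upstream of the poly(A) tail
    if len(body) == len(rna):
        raise ValueError("mRNA has no poly(A) tail")
    if len(body) < 2:
        raise ValueError("need at least two bases upstream of the poly(A) tail")
    # The primer anneals antiparallel to the mRNA, so the anchor is the reverse
    # complement of the two bases that precede the tail.
    anchor = "".join(COMPLEMENT[base] for base in reversed(body[-2:]))
    return "T" * t_run + anchor


if __name__ == "__main__":
    # Hypothetical transcript ending ...UG followed by a poly(A) tail.
    print(anchored_primer("GGCAUCCUGAAAAAAAAAAAA"))
```

A transcript ending in UG before its tail is primed by the (dT11)CA primer, while transcripts with other bases at those two positions require a differently anchored primer, which is why several anchors are combined to cover most mRNA species.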

Although DD is perhaps the most popular approach used today for identifying 
differentially expressed genes, it does suffer from several perceived disadvantages: 

(1) It may have a strong bias towards high copy number mRNAs (Bertioli et al.
1995), although this has been disputed (Wan et al. 1996) and the isolation of very
low abundance genes may be achieved in certain circumstances (Guimeraes et
al. 1995a).

(2) The cDNAs obtained often only represent the extreme 3' end of the mRNA
(often the 3'-untranslated region), although this may not always be the case
(Guimeraes et al. 1995a). Since the 3' end is often not included in Genbank and
shows variation between organisms, cDNAs identified by DD cannot always be
matched with their genes, even if they have been identified.

(3) The pattern of differential expression seen on the display often cannot be
reproduced on Northern blots, with false positives arising in up to 70% of cases
(Sun et al. 1994). Some adaptations have been shown to reduce false positives,
including the use of two reverse transcriptases (Sung and Denman 1997),
comparison of uninduced and induced cells over a time course (Burn et al. 1994)
and comparison of DDPCR-products from two uninduced and two induced
lines (Sompayrac et al. 1995). The latter authors also reported that the use of
cytoplasmic RNA rather than total RNA reduces false positives arising from
nuclear RNA that is not transported to the cytoplasm.

Further details of the background, strengths and weaknesses of the DD 
technique can be obtained from a review by McClelland et al. (1996) and from 
articles by Liang et al. (1995) and Wan et al. (1996).






[Figure 8 schematic: mRNA with poly(A) tail primed either with a (dTn)CA anchored primer or with
an arbitrary primer to give 1st strand cDNA; denature and synthesise 2nd strand with any arbitrary
primer to give 2nd strand cDNA; cDNA can now be amplified by PCR using the original primer pair.]

Figure 8. Two approaches to differential display (DD) analysis. 1st strand synthesis can be carried out
either with a polydT(n)NN primer (where N = G, C or A) or with an arbitrary primer. The use of
different combinations of G, C and A to anchor the first strand polydT primer enables the priming
of the majority of polyadenylated mRNAs. Arbitrary primers may hybridize at none, one or more
places along the length of the mRNA, allowing 1st strand cDNA synthesis to occur at none, one
or more points in the same gene. In both cases, 2nd strand synthesis is carried out with an arbitrary
primer. Since these arbitrary primers for the 2nd strand may also hybridize to the 1st strand cDNA
in a number of different places, several different 2nd strand products may be obtained from one
binding point of the 1st strand primer. Following 2nd strand synthesis, the original set of primers
is used to amplify the second strand products, with the result that numerous gene sequences are
amplified.



Restriction endonuclease-facilitated analysis of gene expression 
Serial Analysis of Gene Expression (SAGE) 

A more recent development in the field of differential display is SAGE analysis 
(Velculescu et al. 1995). This method uses a different approach to those discussed so
far and is based on two principles. Firstly, in more than 95% of cases, short
nucleotide sequences ('tags') of only nine or 10 base pairs provide sufficient
information to identify their gene of origin. Secondly, concatenation (linking
together in a series) of these tags allows sequencing of multiple cDNAs within a
single clone. Figure 9 shows a schematic representation of the SAGE process. In this
procedure, double stranded cDNA from the test cells is synthesized with a
biotinylated polydT primer. Following digestion with a commonly cutting (4 bp
recognition sequence) restriction enzyme ('anchoring enzyme'), the 3' ends of the
cDNA population are captured with streptavidin beads. The captured population is
split into two and different adaptors ligated to the 5' ends of each group. Incorporated
into the adaptors is a recognition sequence for a type IIS restriction enzyme, one
which cuts DNA at a defined distance (< 20 bp) from its recognition sequence.
Hence, following digestion of each captured cDNA population with the IIS enzyme,
the adaptors plus a short piece of the captured cDNA are released. The two
populations are then ligated and the products amplified. The amplified products are
cleaved with the original anchoring enzyme, religated (concatemers are formed in
the process) and cloned. The advantage of this system is that hundreds of gene tags
can be identified by sequencing only a few clones. Furthermore, the number of times
a given transcript is identified is a quantitative measurement of that gene's
abundance in the original population, a feature which facilitates identification of
differentially expressed genes in different cell populations.
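
Because abundance is read off as tag counts, the analysis of a sequenced concatemer reduces to splitting and counting. The sketch below is a simplified illustration, not the published analysis software: it assumes a CATG anchoring site (as drawn in figure 9), fixed 9 bp tags, ditags of exactly 18 bp with the second tag read on the opposite strand, and hypothetical function names. Ditags of any other length, including those broken by an internal CATG, are simply skipped.

```python
from collections import Counter

ANCHOR = "CATG"   # anchoring enzyme site, as drawn in figure 9
TAG_LEN = 9       # assumed fixed tag length
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def tags_from_concatemer(seq: str) -> list:
    """Split a concatemer at anchoring sites and return the individual tags."""
    tags = []
    for ditag in seq.upper().split(ANCHOR):
        if len(ditag) != 2 * TAG_LEN:
            continue  # empty flank or malformed ditag: ignored in this sketch
        tags.append(ditag[:TAG_LEN])           # first tag, read as sequenced
        tags.append(revcomp(ditag[TAG_LEN:]))  # second tag lies on the other strand
    return tags


def tag_counts(concatemers) -> Counter:
    """Count how often each tag occurs across a set of sequenced clones."""
    counts = Counter()
    for seq in concatemers:
        counts.update(tags_from_concatemer(seq))
    return counts


if __name__ == "__main__":
    # Made-up tags for illustration only.
    clone = (ANCHOR + "AAACCCGGG" + revcomp("TTTGGGCCC")
             + ANCHOR + "AAACCCGGG" + revcomp("ACGTACGTA") + ANCHOR)
    for tag, n in tag_counts([clone]).most_common():
        print(tag, n)
```

Comparing the resulting counts between two libraries, after normalizing for the total number of tags sequenced, gives the relative abundance estimate described above.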

Some disadvantages of SAGE analysis are that the method is technically difficult,
requires a large amount of accurate sequencing, is biased towards abundant
mRNAs, has not been validated in the pharmaco/toxicogenomic setting and has
only been used to examine well known tissue differences to date.

Gene Expression Fingerprinting (GEF) 

A different capture/restriction digest approach for isolating differentially 
expressed genes has been described by Ivanova and Belyavsky (1995). In this 
method, RNA is converted to cDNA using biotinylated oligo(dT) primers. The 
cDNA population is then digested with a specific endonuclease and captured with 
magnetic streptavidin microbeads to facilitate removal of the unwanted 5' digestion
products. The use of restricted 3'-ends alone serves to reduce the complexity of the
cDNA fragment pool and helps to ensure that each RNA species is represented by 
not more than one restriction product. An adaptor is ligated to facilitate subsequent 
amplification of the captured population. PCR is carried out with one adaptor- 
specific and one biotinylated polydT primer. The reamplified population is 
recaptured and the non-biotinylated strands removed by alkaline dissociation. The 
non-biotinylated strand is then resynthesized using a different adaptor-specific 
primer in the presence of a radiolabeled dNTP. The labelled immobilized 3' cDNA
ends are next sequentially treated with a series of different restriction endonucleases 
and the products from each digestion analysed by PAGE. The result is a fingerprint 
composed of a number of ladders (equal to the number of sequential digests used). 
By comparing test versus control fingerprints, it is possible to identify differentially 
expressed products which can then be isolated from the gel and cloned. The 
advantages of this procedure are that it is very robust and reproducible, and the 
authors estimate that 80-93% of cDNA molecules are involved in the final 
fingerprint. The disadvantage is that polyacrylamide gels can rarely resolve more 
than 300-400 bands, which compares poorly to the 1000 or more which are 
estimated to be produced in an average experiment. The use of 2-D gels such as 
those described by Uitterlinden et al. (1989) and Hatada et al. (1991) may help to
overcome this problem. 

A similar method for displaying restriction endonuclease fragments was later 
described by Prashar and Weissman (1996). However, instead of sequential 
digestion of the immobilized 3'-terminal cDNA fragments, these authors simply
compared the profiles of the control and treated populations without further
manipulation.

[Figure 9 schematic: cDNA cleaved at CATG sites by the anchoring enzyme (AE); cleavage with the
tagging enzyme (TE) to produce blunt ends; the two pools of linker-tag fragments are ligated and
amplified to give ditags; cleavage with AE, isolation of ditags, concatenation, cloning and
sequencing, giving a concatemer of tags (Tag 1 to Tag 4) separated by CATG sites.]

Figure 9. Serial analysis of gene expression (SAGE) analysis. cDNA is cleaved with an anchoring enzyme
(AE) and the 3' ends captured using streptavidin beads. The cDNA pool is divided in half and each
portion ligated to a different linker, each containing a type IIS restriction site (tagging enzyme,
TE). The type IIS enzyme releases the linker plus a short length of cDNA
(XXXXX and OOOOO indicate nucleotides of different tags). The two pools of tags are then
ligated and amplified using linker-specific primers. Following PCR, the products are cleaved with
the AE and the ditags isolated from the linkers using PAGE. The ditags are then ligated (during
which process concatenization occurs) and cloned into a vector of choice for sequencing. After
Velculescu et al. (1995), with permission.

DNA arrays

'Open' differential display systems are cumbersome in that it takes a great deal
of time to extract and identify candidate genes and then confirm that they are indeed
up- or down-regulated in the treated compared to the control tissue. Normally the
latter process is carried out using Northern blotting or RT-PCR. Even so, each of
the aforementioned steps produces a bottleneck to the ultimate goal of rapid analysis
of gene expression. These problems will likely be addressed by the development of
so-called DNA arrays (e.g. Gress et al. 1992, Zhao et al. 1995, Schena et al. 1996),
the introduction of which has signalled the next era in differential gene expression
analysis. DNA arrays consist of a gridded membrane or glass 'chips' containing
hundreds or thousands of DNA spots, each consisting of multiple copies of part of
a known gene. The genes are often selected based on previously proven involvement
in oncogenesis, cell cycling, DNA repair, development and other cellular processes.
They are usually chosen to be as specific as possible for each gene and animal species.
Human and mouse arrays are already commercially available and a few companies
will construct a personalized array to order, for example Clontech Laboratories and
Research Genetics Inc. The technique is rapid in that hundreds or even thousands
of genes can be spotted on a single array, and that mRNA/cDNA from the test
populations can be labelled and used directly as probe. When analysed with
appropriate hardware and software, arrays offer a rapid and quantitative means to
assess differences in gene expression between two cell populations. Of course, there
can only be identification and quantitation of those genes which are in the array
(hence the term 'closed' system). Therefore, one approach to elucidating the
molecular mechanisms involved in a particular disease/development system may be
to combine an open and a closed system: a DNA array to directly identify and
quantitate the expression of known genes in mRNA populations, and an open
system such as SSH to isolate unknown genes which are differentially expressed.
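
The quantitative comparison that array software performs can be sketched in a few lines. This is an illustrative example only, not the algorithm of any particular commercial package: the function names, the pseudocount, the two-fold threshold and all spot intensities are assumptions made for the example.

```python
import math


def log2_ratios(control: dict, treated: dict, pseudocount: float = 1.0) -> dict:
    """log2(treated/control) for every spot present on both arrays."""
    shared = control.keys() & treated.keys()
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in shared
    }


def classify(ratios: dict, threshold: float = 1.0) -> dict:
    """Label each gene 'up', 'down' or 'unchanged' (default cut-off: 2-fold)."""
    calls = {}
    for gene, ratio in ratios.items():
        if ratio >= threshold:
            calls[gene] = "up"
        elif ratio <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


if __name__ == "__main__":
    # Invented background-corrected spot intensities, for illustration only.
    control = {"gene_A": 99.0, "gene_B": 399.0, "gene_C": 199.0}
    treated = {"gene_A": 399.0, "gene_B": 99.0, "gene_C": 199.0}
    print(classify(log2_ratios(control, treated)))
```

Only genes spotted on both arrays can be scored, which is the 'closed' limitation described above; genes absent from the array never appear in the result.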

One of the main advantages of DNA arrays is the huge number of gene fragments
which can be put on a membrane; some companies have reported gridding up to
60000 spots on a single glass 'chip' (microscope slide). These high density chip-
based micro-arrays will probably become available as mass-produced off-the-shelf 
items in the near future. This should facilitate the more rapid determination of 
differential expression in time and dose-response experiments. Aside from their 
high cost and the technical complexities involved in producing and probing DNA 
arrays, the main problem which remains, especially with the newer micro-array 
(gene-chip) technologies, is that results are often not wholly reproducible between 
arrays. However, this problem is being addressed and should be resolved within the 
next few years. 



EST databases as a means to identify differentially expressed genes 

Expressed sequence tags (ESTs) are partial sequences of clones obtained from 
cDNA libraries. Even though most ESTs have no formal identity (putative 
identification is the best to be hoped for), they have proven to be a rapid and efficient 
means of discovering new genes and can be used to generate profiles of gene- 
expression in specific cells. Since they were first described by Adams et al (1991) 
there has been a huge explosion in EST production and it is estimated that there are 
now well over a million such sequences in the public domain, representing over half 
of all human genes (Hillier et al. 1996). This large number of freely available
sequences (both sequence information and clones are normally available royalty-free
from the originators) has enabled the development of a new approach towards
differential gene expression analysis as described by Vasmatzis et al. (1998). The
approach is simple in theory: EST databases are first searched for genes that have a 
number of related EST sequences from the target tissue of choice, but none or few 
from non-target tissue libraries. Programmes to assist in the assembly of such sets of 
overlapping data may be developed in-house or obtained privately or from the 
internet. For example, the Institute for Genomic Research (TIGR, found at
http://www.tigr.org) provides many software tools free of charge to the scientific
community. Included amongst these is the TIGR assembler (Sutton et al. 1995), a
tool for the assembly of large sets of overlapping data such as ESTs, bacterial
artificial chromosomes (BACs), or small genomes. Candidate EST clones
representing different genes are then analysed using RNA blot methods for size and
tissue specificity and, if required, used as probes to isolate and identify the full length
cDNA clone for further characterization. In practice, however, the method is rather
more involved, requiring bioinformatic and computer analysis coupled with
confirmatory molecular studies. Vasmatzis et al. (1998) have described several
problems in this fledgling approach, such as separating highly homologous
sequences derived from different genes and an overemphasis of specificity for some 
EST sequences. However, since these problems will largely be addressed by the 
development of more suitable computer algorithms and an increased completeness 
of the EST database, it is likely that this approach to identifying differentially 
expressed genes may enjoy more patronage in the future. 
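
The database search described above can be sketched as a simple filter. The following is a minimal Python illustration only, not the procedure used by Vasmatzis et al. (1998): the cluster names, the tissue labels and the two thresholds (`min_target`, `max_nontarget`) are hypothetical, and it assumes the ESTs have already been assembled into gene clusters.

```python
from collections import Counter


def tissue_specific_clusters(est_records, target_tissue, min_target=3, max_nontarget=1):
    """Return clusters with many ESTs from the target tissue and few from elsewhere.

    est_records: iterable of (cluster_id, tissue) pairs, one pair per EST.
    The default thresholds are illustrative assumptions, not published values.
    """
    target = Counter()
    nontarget = Counter()
    for cluster_id, tissue in est_records:
        if tissue == target_tissue:
            target[cluster_id] += 1
        else:
            nontarget[cluster_id] += 1
    # Counter returns 0 for clusters never seen in non-target libraries.
    return sorted(
        c for c in target
        if target[c] >= min_target and nontarget[c] <= max_nontarget
    )


# Hypothetical example data: (cluster, tissue of the source cDNA library)
ests = [
    ("cluster_A", "prostate"), ("cluster_A", "prostate"), ("cluster_A", "prostate"),
    ("cluster_B", "prostate"), ("cluster_B", "liver"), ("cluster_B", "brain"),
    ("cluster_C", "prostate"), ("cluster_C", "prostate"), ("cluster_C", "prostate"),
    ("cluster_C", "liver"),
]
candidates = tissue_specific_clusters(ests, "prostate")
print(candidates)
```

Clusters passing such a filter are only candidates; as noted above, they would still need confirmation by RNA blot or similar methods.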



Problems and potential of differential expression techniques 

The holistic or single cell approach?

When working with in vivo models of differential expression, one of the first 
issues to consider must be the presence of multiple cell types in any given specimen.
For example, a liver sample is likely to contain not only hepatocytes, but also
(potentially) Ito cells, bile ductule cells, endothelial cells, various immune cells (e.g.
lymphocytes, macrophages and Kupffer cells) and fibroblasts. Other tissues will
each have their own distinctive cell populations. Also, in the case of neoplastic tissue,
there are almost always normal, hyperplastic and/or dysplastic cells present in a
sample. One must, therefore, be aware that genes obtained from a differential 
display experiment performed on an animal tissue model may not necessarily arise 
exclusively from the intended 'target' cells, e.g. hepatocytes/neoplastic cells. If 
appropriate, further analyses using immunohistochemistry, in situ hybridization or 
in situ RT-PCR should be used to confirm which cell types are expressing the
gene(s) of interest. This problem is probably most acute for those studying the
differential expression of genes in the development of different cell types, where
there is a need to examine homologous cell populations. The problem is now being
addressed at the National Cancer Institute (Bethesda, MD, USA), where new
microdissection techniques have been employed to assist in their gene analysis
programme, the Cancer Genome Anatomy Project (CGAP) (for more information see
the web site http://www.ncbi.nlm.nih.gov/ncicgap/intro.html). There are also
separation techniques available that utilise cell-specific antigens as a means to
isolate target cells, e.g. fluorescence activated cell sorting (FACS) (Dunbar et al.
1998, Kas-Deelen et
al. 1998) and magnetic bead technology (Richard et al. 1998, Rogler et al. 1998). 

However, those taking a holistic approach may consider this issue unimportant. 
There is an equally appropriate view that all those genes showing altered expression 
within a compromised tissue should be taken into consideration. After all, since all
tissues are complex mixes of different, interacting cell types which intimately 
regulate each other's growth and development, it is clear that each cell type could in 
some way contribute (positively or negatively) towards the molecular mechanisms 
which lie behind responses to external stimuli or neoplastic growth. It is perhaps 
then more informative to carry out differential display experiments using in vivo as 
opposed to in vitro models, where uniform populations of identical cells probably 
represent a partial, skewed or even inaccurate picture of the molecular changes that 
occur. 

The incidence and possible implications of inter-individual biological variation 
should be considered in any approach where whole animal models are being used. It 
is clear that individuals (humans and animals) respond in different ways to identical 
stimuli. One of the best characterized examples is the debrisoquine oxidation 
polymorphism, which is mediated by cytochrome CYP2D6 and determines the 
pharmacokinetics of many commonly prescribed drugs (Lennard 1993, Meyer and 
Zanger 1997). The reasons for such differences are varied and complex, but allelic 
variations, regulatory region polymorphisms and even physical and mental health 
can all contribute to observed differences in individual responses. Careful thought 
should, therefore, be given to the specific objectives of the study and to the possible 
value of pooling starting material (tissue/mRNA). The effect of this can be 
beneficial through the ironing out of exaggerated responses and unimportant minor 
fluctuations of (mechanistically) irrelevant genes in individual animals, thus 
providing a clearer overall picture of the general molecular mechanisms of the 
response. However, at the same time such minor variations may be of utmost 
importance in deciding the ability of individual animals to succumb to or resist the 
effects of a given chemical/disease. 



How efficient are differential expression techniques at recovering a high percentage of 
differentially expressed genes?

A number of groups have produced experimental data suggesting that mam- 
malian cells produce between 8000-15000 different mRNA species at any one time 
(Mechler and Rabbitts 1981, Hedrick et al. 1984, Bravo 1990), although figures as 
high as 20-30000 have also been quoted (Axel et al. 1976). Hedrick et al. (1984) 
provided evidence suggesting that the majority of these belong to the rare abundance 
class. A breakdown of this abundance distribution is shown in table 1. 

When the results of differential display experiments have been compared with 
data obtained previously using other methods, it is apparent that not all differentially 
expressed mRNAs are represented in the final display. In particular, rare messages 
(which, importantly, often include regulatory proteins) are not easily recovered 
using differential display systems. This is a major shortcoming, as the majority of 
mRNA species exist at levels of less than 0.005 % of the total population (table 1). 
Bertioli et al. (1995) examined the efficiency of DD templates (heterogeneous 
mRNA populations) for recovering rare messages and were unable to detect mRNA 
species present at less than 1.2% of the total mRNA population, equivalent to an
intermediate or abundant species. Interestingly, when simple model systems (single
target only) were used instead of a heterogeneous mRNA population, the same
primers could detect levels of target mRNA down to 10000 times smaller. These results
are probably best explained by competition for substrates from the many PCR 
products produced in a DD reaction. 
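
The size of this shortfall can be made explicit with a short calculation. The Python sketch below uses only the figures quoted above; the variable names are hypothetical, and it assumes that the 10000-fold improvement is measured against the 1.2% limit.

```python
# Figures quoted in the text, as percent of the total mRNA population
RARE_CLASS_CEILING = 0.005       # most mRNA species lie below this level (table 1)
DD_LIMIT_HETEROGENEOUS = 1.2     # detection limit reported by Bertioli et al. (1995)
FOLD_GAIN_SIMPLE_MODEL = 10_000  # improvement seen with a single-target template

# Detection limit in the simple model system (assumed baseline: the 1.2% limit)
dd_limit_simple = DD_LIMIT_HETEROGENEOUS / FOLD_GAIN_SIMPLE_MODEL

# How far the heterogeneous-template limit sits above the rare-message ceiling
shortfall = DD_LIMIT_HETEROGENEOUS / RARE_CLASS_CEILING

print(f"simple-model limit: {dd_limit_simple:.5f}% of total mRNA")
print(f"heterogeneous limit is about {shortfall:.0f}-fold above the rare-class ceiling")
```

On these figures, the heterogeneous-template limit lies roughly 240-fold above the level at which most messages are found, whereas the single-target limit lies well below it.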

The numbers of differentially expressed mRNAs reported in the literature using 
various model systems provide further evidence that many differentially expressed
mRNAs are not recovered. For example, DeRisi et al. (1997) used DNA array 
technology to examine gene expression in yeast following exhaustion of sugar in the 
medium, and found that more than 1700 genes showed a change in expression of at 
least 2-fold. In light of such a finding, it would not be unreasonable to suggest that 
of the 8000-15 000 different mRNA species produced by any given mammalian cell, 
up to 1000 or more may show altered expression following chemical stimulation. 
Whilst this may be an extreme figure, it is known that at least 100 genes are
activated/upregulated in Jurkat (T-) cells following IL-2 stimulation (Ullman et al.
1990). In addition, Wan et al. (1996) estimated that interferon-γ-stimulated HeLa
cells differentially express up to 433 genes (assuming 24000 distinct mRNAs 
expressed by the cells). However, there have been few publications documenting 
anywhere near the recovery of these numbers. For example, in using DD to compare 
normal and regenerating mouse liver, Bauer et al. (1993) found only 70 of 38000 
total bands to be different. Of these, 50% (35 genes) were shown to correspond to 
differentially expressed bands. Chen et al. (1996) reported 10 genes upregulated in 
female rat liver following ethinyl estradiol treatment. McKenzie and Drake (1997) 
identified 14 different gene products whose expression was altered by phorbol 
myristate acetate (PMA, a tumour promoter agent) stimulation of a human
myelomonocytic cell line. Kilty and Vickers (1997) identified 10 different gene 
products whose expression was upregulated in the peripheral blood leukocytes of 
allergic disease sufferers. Linskens et al. (1995) found 23 genes differentially 
expressed between young and senescent fibroblasts. Techniques other than DD 
have also provided an apparent paucity of differentially expressed genes. Using SH,
for example, Cao et al. (1997) found 15 genes differentially expressed in colorectal
cancer compared to normal mucosal epithelium. Fitzpatrick et al. (1995) isolated 17
genes upregulated in rat liver following treatment with the peroxisome proliferator
clofibrate; Philips et al. (1990) isolated 12 cDNA clones which were upregulated in 
highly metastatic mammary adenocarcinoma cell lines compared to poorly meta- 
static ones. Prashar and Weissman (1996) used 3' restriction fragment analysis and 
identified approximately 40 genes showing altered expression within 4h of 
activation of Jurkat T-cells. Groenink and Leegwater (1996) analysed 27 gene 
fragments isolated using SSH of delayed early response phase of liver regeneration 
and found only 12 to be upregulated. 

In the laboratory, SSH was used to isolate up to 70 candidate genes which appear 
to show altered expression in guinea pig liver following short-term treatment with 
the peroxisome proliferator, WY-14,643 (Rockett, Swales, Esdaile and Gibson, 
unpublished observations). However, these findings have still to be confirmed by 
analysis of the extracted tissue mRNA for differential expression of these sequences. 

Whilst the latest differential display technologies are purported to include design
and experimental modifications to overcome this lack of efficiency (in both the total
number of differentially expressed genes recovered and the percentage that are true
positives), it is still not clear if such adaptations are practically effective: proving
efficiency by spiking with a known amount of limited numbers of artificial 
construct(s) is one thing, but isolating a high percentage of the rare messages already 
present in an mRNA population is another. Of course, some models will genuinely 
produce only a small number of differentially expressed genes. In addition, there are 
also technical problems that can reduce efficiency. For example, mRNAs may have 
an unusual primary structure that effectively prevents their amplification by PCR- 
based systems. In addition, it is known that under certain circumstances not all 
mRNAs have 3' polyA sites. For example, during Xenopus development,
deadenylation is used as a means to stabilize RNAs (Voeltz and Steitz 1998), whilst
preferential deadenylation may play a role in regulating Hsp70 (and perhaps,
therefore, other stress protein) expression in Drosophila (Dellavalle et al. 1994). The
presence of deadenylated mRNAs would clearly reduce the efficiency of systems 
utilizing a polydT reverse transcription step. The efficiency of any system also 
depends on the quality of the starting material. All differential display techniques 
use mRNA as their target material. However, it is difficult to isolate mRNA that is 
completely free of ribosomal RNA. Even if polydT primers are used to prime first
strand cDNA synthesis, ribosomal RNA is often transcribed to some degree
(Clontech PCR-Select cDNA Subtraction kit user manual). It has been shown, at
least in the case of SSH, that a high rRNA:mRNA ratio can lead to inefficient
subtractive hybridization (Clontech PCR-Select cDNA Subtraction kit user 
manual), and there is no reason to suppose that it will not do likewise in other SH 
approaches. Finally, those techniques that utilise a presubtraction amplification step 
(e.g. RDA) may present a skewed representation since some sequences amplify 
better than others. 

Of course, probably the most important consideration is the temporal factor. It 
is clear that any given differential display experiment can only interrogate a cell at 
one point in time. It may well be that a high percentage of the genes showing altered 
expression at that time are obtained. However, given that disease processes and 
responses to environmental stimuli involve dynamic cascades of signalling, 
regulation, production and action, it is clear that all those genes which are switched 
on/off at different times will not be recovered and, therefore, vital information may 
well be missed. It is, therefore, imperative to obtain as much information about the 
model system beforehand as possible, from which a strategy can be derived for 
targeting specific time points or events that are of particular interest to the 
investigator. One way of getting round this problem of single time point analysis is 
to conduct the experiment over a suitable time course which, of course, adds 
substantially to the amount of work involved. 



How sensitive are differential expression technologies? 

There has been little published data that addresses the issue of how large the 
change in expression must be for it to permit isolation of the gene in question with 
the various differential expression technologies. Although the isolation of genes 
whose expression is changed as little as 1.5-fold has been reported using SSH 
(Groenink and Leegwater 1996), it appears that those demonstrating a change in 
excess of 5-fold are more likely to be picked up. Thus, there is a 'grey zone' 
in between where small changes could fade in and out of isolation between 
experiments and animals. DD, on the other hand, is not subject to this grey
zone since, unlike SH approaches, it does not amplify the difference in expression 
between two samples. Wan et al. (1996) reported that differences in expression of 
twofold or more are detectable using DD. 

Resolution and visualization of differential expression products 

It seems highly improbable with current technology that a gel system could be 
developed that is able to resolve all gene species showing altered expression in any 
system (be it SH- or DD-based). Polyacrylamide gel electrophoresis
(PAGE) can resolve size differences down to 0.2% (Sambrook et al. 1989) and is
used as standard in DD experiments. Even so, it is clear that a complex series of gene
products such as those seen in a DD will contain unresolvable components. Thus, 
what appears to be one band in a gel may in fact turn out to be several. Indeed, it has 
been well documented (Mathieu-Daude et al. 1996, Smith et al. 1997) that a single 
band extracted from a DD often represents a composite of heterogeneous products, 
and the same has been found for SSH displays in this laboratory (Rockett et al. 
1997). One possible solution was offered by Mathieu-Daude et al. (1996), who 
extracted and reamplified candidate bands from a DD display and used single strand 
conformation polymorphism (SSCP) analysis to confirm which components 
represented the truly differentially expressed product. 

Many scientists often try to avoid the use of PAGE where possible because it is 
technically more demanding than agarose gel electrophoresis (AGE). Unfortunately, 
high resolution agarose gels such as Metaphor (FMC, Lichfield, UK) and AquaPor 
HR (National Diagnostics, Hessle, UK), whilst easier to prepare and manipulate 
than PAGE, can only separate DNA sequences which differ in size by around 
1.5-2% (15-20 base pairs for a 1 kb fragment). Thus, SSH, RDA or other such
products which differ in size by less than this amount are normally not resolvable. 
However, a simple technique does in fact exist for increasing the resolving power of 
AGE— the inclusion of HA-red (10-phenyl neutral red-PEG ligand) or HA-yellow 
(bisbenzamide-PEG ligand) (Hanse Analytik GmbH, Bremen, Germany) in a 
gel separates identical or closely sized products on base content. Specifically, 
HA-red and -yellow selectively bind to GC and AT DNA motifs, respectively 
(Wawer et al. 1995, Hanse Analytik 1997, personal communication). Since both 
HA-stains possess an overall positive charge, they migrate towards the cathode 
when an electric field is applied. This is in direct opposition to DNA, which 
is negatively charged and, therefore, migrates towards the anode. Thus, if two
DNA clones are identical in size (as perceived on a standard high resolution 
agarose gel), but differ in AT/GC content, inclusion of a HA-dye in the gel 
will effectively retard the migration of one of the sequences compared to the 
other, effectively making it apparently larger and, thus, providing a means of 
differentiating between the two. The use of HA-red has been shown to resolve 
sequences with an AT variation of less than 1% (Wawer et al. 1995), whilst Hanse
Analytik have reported that HA staining is so sensitive that in one case it was used 
to distinguish two 567bp sequences which differed by only a single point mutation 
(Hanse Analytik 1996, personal communication). Therefore, if one wishes to check
whether all the clones produced from a specific band in a differential display
experiment are derived from the same gene species, a small amount of reamplified
or digested clone can be run on a standard high resolution gel, and a second aliquot
in a similar gel containing one of the HA-stains. The standard gel should indicate
any gross size differences, whilst the HA-stained gel should separate otherwise
unresolvable species (on standard AGE) according to their base content. Geisinger
et al. (1997) reported successful use of this approach for identifying DD-derived
clones. Figure 10 shows such an experiment carried out in this laboratory on clones
obtained from a band extracted from an SSH display.

Figure 10. Discrimination of clones of identical/nearly identical size using HA-red. Bands of decreasing
size (1-5) were extracted from the final display of a suppression subtractive hybridization
experiment and cloned. Seven colonies were picked at random from each cloned band and their
inserts amplified using PCR. The products were run on two gels, (A) a high resolution 2% agarose
gel, and (B) a high resolution 2% agarose gel containing 1 U/ml HA-red. With few exceptions all
the clones from each band appear to be the same size (gel A). However, the presence of HA-red
(gel B), which separates identically-sized DNA fragments based on the percentage of GC within
the sequence, clearly indicates the presence of different gene species within each band. For
example, even though all five re-amplified clones of band 1 appear to be the same size, at least four
different gene species are represented.

An alternative approach is to carry out a 2-D analysis of the differential display 
products. In this approach, size-based separation is first carried out in a standard 
agarose gel. The gel slice containing the display is then extracted and incorporated 
into a HA gel for resolution based on AT/GC content.
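
The logic of the two-dimensional separation (size first, then base content) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a model of gel behaviour: the 1.5% size threshold is the lower agarose figure quoted earlier, the 1% GC threshold (`min_gc_diff`) is an assumed cut-off, and the clone sequences are invented.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def resolvable_by_size(len_a, len_b, min_rel_diff=0.015):
    """True if two fragment lengths differ by at least min_rel_diff of the longer one."""
    longer = max(len_a, len_b)
    return abs(len_a - len_b) / longer >= min_rel_diff


def resolvable_2d(seq_a, seq_b, min_rel_diff=0.015, min_gc_diff=0.01):
    """True if two fragments separate in either dimension: size or GC content.

    min_gc_diff is a hypothetical threshold chosen for illustration only.
    """
    by_size = resolvable_by_size(len(seq_a), len(seq_b), min_rel_diff)
    by_content = abs(gc_fraction(seq_a) - gc_fraction(seq_b)) >= min_gc_diff
    return by_size or by_content


# Two invented 100 bp clones of identical size but different base content
clone_1 = "AT" * 50
clone_2 = "AT" * 40 + "GC" * 10

print(resolvable_by_size(len(clone_1), len(clone_2)))  # size alone
print(resolvable_2d(clone_1, clone_2))                 # size plus base content
```

In this toy case the two clones co-migrate on size alone but are distinguished once base content is taken into account, which is the situation shown in figure 10.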

Of course, one should always consider the possibility of there being different 
gene species which are the same size and have the same GC/AT content. However, 
even these species are not unresolvable given some effort; again, one might use
SSCP, or perhaps a denaturing gradient gel electrophoresis (DGGE) or temperature
gradient gel electrophoresis (TGGE) approach to resolve the contents of a band,
either directly on the extracted band (Suzuki et al. 1991) or on the reamplified
product. 

The requirement of some differential display techniques to visualize large
numbers of products (e.g. DD and GEF) can also present a problem in that, in terms 
of numbers, the resolution of PAGE rarely exceeds 300-400 bands. One approach to 
overcoming this might be to use 2-D gels such as those described by Uitterlinden et 
al. (1989) and Hatada et al. (1991). 

Extraction of differentially expressed bands from a gel can be complex since, in
some cases (e.g. DD, GEF), the results are visualized by autoradiographic means, 
such that precise overlay of the developed film on the gel must occur if the correct 
band is to be extracted for further analysis. Clearly, a misjudged extraction can 
account for many man-hours lost. This problem, and that of the use of radioisotopes,
has been addressed by several groups. For example, Lohmann et al. (1995) 
demonstrated that silver staining can be used directly to visualize DD bands in 
horizontal PAGs. An et al. (1996) avoided the use of radioisotopes by transferring a 
small amount (20-30%) of the DNA from their DD to a nylon membrane, and 
visualizing the bands using chemiluminescent staining before going back to extract 
the remaining DNA from the gel. Chen and Peck (1996) went one step further and 
transferred the entire DD to a nylon membrane. The DNA bands were then 
visualized using a digoxigenin (DIG) system (DIG was attached to the polydT
primers used in the differential display procedure). Differentially expressed bands 
were cut from the membrane and the DNA eluted by washing with PCR buffer prior 
to reamplification. 

One of the advantages of using techniques such as SSH and RDA is that the final
display can be run on an agarose gel and the bands visualized with simple ethidium
bromide staining. Whilst this approach can provide acceptable results, overstaining
with SYBR Green I or SYBR Gold nucleic acid stains (FMC) effectively enhances
the intensity and sharpness of the bands. This greatly aids in their precise extraction
and often reveals some faint products that may otherwise be overlooked. Whilst
differential displays stained with SYBR Green I are better visualized using short 
wavelength UV (254 nm) rather than medium wavelength (306 nm), the shorter 
wavelength is much more DNA damaging. In practice, it takes only a few seconds 
to damage DNA extracted under 254 nm irradiation, effectively preventing 
reamplification and cloning. The best approach is to overstain with SYBR Green I 
and extract bands under a medium wavelength UV transillumination. 



The possible use of 'microfingerprinting ' to reduce complexity 

Given the sheer number of gene products and the possible complexity of each 
band, an alternative approach to rapid characterization may be to use an enhanced 
analysis of a small section of a differential display: a 'sub-fingerprint' or
'micro-fingerprint'. In this case, one could concentrate on those bands which only appear
in a particular chosen size region. Reducing the fingerprint in this way has at least 
two advantages. One is that it should be possible to use different gel types, 
concentrations and run times tailored exactly to that region. Currently, one might 
run products from 100-3000+ bp on the same gel, which leads to compromise in the
gel system being used and consequently to suboptimal resolution, both in terms of 
size and numbers, and can lead to problems in the accurate excision of individual 
bands. Secondly, it may be possible to enhance resolution by using a 2-D analysis 
using a HA-stain, as described earlier. In summary, if a range of gene product sizes 
is carefully chosen to include certain 'relevant' genes, the 2-D system standardized,
and appropriate gene analysis used, it may be possible to develop a method for the 
early and rapid identification of compounds which have similar or widely different 
cellular effects. If the prognosis for exposure to one or more other chemicals which 
display a similar profile is already known, then one could perhaps predict similar 
effects for any new compounds which show a similar micro-fingerprint. 

An alternative approach to microfingerprinting is to examine altered expression
in specific families of genes through careful selection of PCR primers and/or
post-reaction analysis. Stress genes, growth factors and/or their receptors, cell cycling
genes, cytochromes P450 and regulatory proteins might be considered as candidates
for analysis in this way. Indeed, some off-the-shelf DNA arrays (e.g. Clontech's
Atlas cDNA Expression Array series) already anticipate this to some degree by
grouping together genes involved in different responses, e.g. apoptosis, stress,
DNA-damage response etc.



Screening 

False positives 

The generation of false positives has been discussed at length amongst the 
differential display community (Liang et al. 1993, 1995, Nishio et al. 1994, Sun et al.
1994, Sompayrac et al. 1995). The reason for false positives varies with the
technique being used. For instance, in RDA, the use of adaptors which have not 
been HPLC purified can lead to the production of false positives through illegitimate 
ligation events (O'Neill and Sinclair 1997), whilst in DD they can arise through 
PCR artifacts and illegitimate transcription of rRNA. In SH, false positives appear
to be derived largely from abundant gene species, although some may arise from 
cDNA/mRNA species which do not undergo hybridization for technical reasons. 

A quick screening of putative differentially expressed clones can be carried out 
using a simple dot blot approach, in which labelled first strand probes synthesized 
from tester and driver mRNA are hybridized to an array of said clones (Hedrick et 
al. 1984, Sakaguchi et al. 1986). Differentially expressed clones will hybridize to
tester probe, but not driver. The disadvantage of this approach is that rare species 
may not generate detectable hybridization signals. One option for those using SSH 
is to screen the clones using a labelled probe generated from the subtracted cDNA 
from which it was derived, and with a probe made from the reverse subtraction 
reaction (ClonTechniques 1997a). Since the SSH method enriches rare sequences, 
it should be possible to confirm the presence of clones representing low abundance 
genes. Despite this quick screening step, there is still the need to go back to the 
original mRNA and confirm the altered expression using a more quantitative 
approach. Although this may be achieved using Northern blots, the sensitivity is 
poor by today's high standards and one must rely on PCR methods for accurate and 
sensitive determinations (see below). 



Sequence analysis 

The majority of differential display procedures produce final products which are 
between 100 and 1000 bp in size. However, this may considerably reduce the size of
the sequence for analysis of the DNA databases. This in turn leads to a reduced
confidence in the result: several families of genes have members whose DNA
sequences are almost identical except in a few key stretches, e.g. the cytochrome
P450 gene superfamily (Nelson et al. 1996). Thus, does the clone identified as being
almost identical to gene X0 really come from that gene, or its brother gene X1, or its
as yet undiscovered sister X2? For example, using SSH, part of a gene was isolated,
which was up-regulated in the liver of rats exposed to Wy-14,643 and was identified
by a FASTA search as being transferrin (data not shown). However, transferrin is 
known to be downregulated by hypolipidemic peroxisome proliferators such as Wy- 
14,643 (Hertz et al. 1996), and this was confirmed with subsequent RT-PCR 
analysis. This suggests that the gene sequence isolated may belong to a gene which 
is closely related to transferrin, but is regulated by a different mechanism. 

A further problem associated with SH technology is redundancy. In most cases 
before SH is carried out, the cDNA population must first be simplified by restriction
digestion. This is important for at least two reasons:

(1) To reduce complexity: long cDNA fragments may form complex networks
which prevent the formation of appropriate hybrids, especially at the high 
concentrations required for efficient hybridization. 

(2) Cutting the cDNAs into small fragments provides better representation of 
individual genes. This is because genes derived from related but distinct 
members of gene families often have similar coding sequences that may cross- 
hybridize and be eliminated during the subtraction procedure (Ko 1990). 
Furthermore, different fragments from the same cDNA may differ considerably 
in terms of hybridization and amplification and, thus, may not efficiently do one 
or the other (Wang and Brown 1991). Thus, some fragments from differentially 
expressed cDNAs may be eliminated during subtractive hybridization pro- 
cedures. However, other fragments may be enriched and isolated. As a 
consequence of this, some genes will be cut one or more times, giving rise to two 
or more fragments of different sizes. If those same genes are differentially 
expressed, then two or more of the different size fragments may come through
as separate bands on the final differential display, increasing the observed 
redundancy and increasing the number of redundant sequencing reactions. 
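
The way in which digestion gives rise to redundancy can be illustrated with a toy digest. The Python sketch below is an illustration only: the cDNA sequence is invented, and the choice of RsaI (recognition site GTAC, cut as GT^AC to leave blunt ends) is an assumption, since the text above does not name the enzyme.

```python
def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site.

    cut_offset is the position within the recognition site at which the
    strand is cut (2 for a blunt GT^AC cut). Returns fragments in 5'-3' order.
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    # Drop empty fragments that would arise from a cut at the very end.
    return [f for f in fragments if f]


# Invented 108 bp cDNA carrying two GTAC sites
cdna = "A" * 30 + "GTAC" + "C" * 50 + "GTAC" + "T" * 20
fragments = digest(cdna, "GTAC", 2)
print([len(f) for f in fragments])
```

A single differentially expressed cDNA of this kind yields three fragments of different sizes, each of which could appear as a separate band and be sequenced independently.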

Sequence comparisons also throw up another important point: at what degree
of sequence similarity does one accept a result? Is 90% identity between a gene
derived from your model species and another acceptably close? Is 95% between
your sequence and one from the same species also acceptable? This problem is
particularly relevant when the forward and reverse sequence comparisons give
similar sequences with completely different gene species. An arbitrary decision
seems to be to allocate genes that are definite (95% and above similarity) and then 
group those between 60 and 95% as being related or possible homologues. 
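
The arbitrary rule described above is easily expressed in code. The following Python sketch uses the 95% and 60% thresholds given in the text; the function name and the 'unassigned' label for hits below 60% are assumptions added for illustration.

```python
def classify_hit(percent_identity, definite=95.0, related=60.0):
    """Assign a database hit to a category by percent sequence identity.

    Thresholds follow the arbitrary rule in the text; the handling of
    hits below the lower threshold is an assumption.
    """
    if percent_identity >= definite:
        return "definite"
    if percent_identity >= related:
        return "related or possible homologue"
    return "unassigned"


# Invented identity scores for three clones
for identity in (97.2, 80.0, 42.0):
    print(identity, classify_hit(identity))
```

Such a rule does not resolve the case in which forward and reverse reads match different genes; those clones would still need to be examined individually.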

Quantitative analysis 

At some point, one must give consideration to the quantitative analysis of the 
candidate genes, either as a means of confirming that they are truly differentially 
expressed, or in order to establish just what the differences are. Northern blot 
analysis is a popular approach as it is relatively easy and quick to perform. However, 
the major drawback with Northern blots is that they are often not sensitive enough 
to detect rare sequences. Since the majority of messages expressed in a cell are of low 
abundance (see table 1), this is a major problem. Consequently, RT-PCR may be the 
method of choice for confirming differential expression. Although the procedure is 
somewhat more complex than Northern analysis, requiring synthesis of primers and 
optimization of reaction conditions for each gene species, it is now possible to set up 
high throughput PCR systems using multichannel pipettes, plates of 96 or more wells and 
appropriate thermal cycling technology. Whilst quantitative analysis is more 
desirable, being more accurate and without reliance on an internal standard, the 
money and time needed to develop a competitor molecule is often excessive, 
especially when one might be examining tens or even hundreds of gene species. The 
use of semi-quantitative analysis is simpler, although still relatively involved. One 
must first of all choose an internal standard that does not change in the test cells 
compared to the controls. Numerous reference genes have been tried in the past, for 
example interferon-gamma (IFN-γ, Frye et al. 1989), β-actin (Heuval et al. 1994), 
glyceraldehyde-3-phosphate dehydrogenase (GAPDH, Wong et al. 1994), 
dihydrofolate reductase (DHFR, Mohler and Butler 1991), β-2-microglobulin 
(β-2-m, Murphy et al. 1990), hypoxanthine phosphoribosyl transferase (HPRT, Foss 
et al. 1998) and a number of others (ClonTechniques 1997b). Ideally, an internal 
standard should not change its level of expression in the cell regardless of cell age, 
stage in the cell cycle or through the effects of external stimuli. However, it has been 
shown on numerous occasions that the levels of most housekeeping genes currently 
used by the research community do in fact change under certain conditions and in 
different tissues (ClonTechniques 1997b). It is imperative, therefore, that pre- 
liminary experiments be carried out on a panel of housekeeping genes to establish 
their suitability for use in the model system. 
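
One possible way to carry out such a preliminary screen is to rank the candidate 
housekeeping genes by how little their signal varies across the conditions of the 
model system. The sketch below uses the coefficient of variation for this purpose; 
this statistic, the function names and the example values are assumptions chosen 
for illustration, not data or a method taken from the studies cited above. 

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return standard deviation divided by mean for a list of signal values."""
    average = mean(values)
    if average == 0:
        raise ValueError("mean signal is zero; cannot compute a ratio")
    return pstdev(values) / average


def rank_reference_genes(panel):
    """Sort candidate reference genes from most to least stable.

    `panel` maps a gene name to its measured signal in each condition
    (control, treated, different time points, and so on).
    """
    return sorted(panel, key=lambda gene: coefficient_of_variation(panel[gene]))


# Hypothetical band intensities (arbitrary units) across four conditions.
hypothetical_panel = {
    "GAPDH": [100.0, 102.0, 98.0, 101.0],
    "beta-actin": [100.0, 150.0, 60.0, 120.0],
    "HPRT": [40.0, 44.0, 38.0, 41.0],
}

ranking = rank_reference_genes(hypothetical_panel)
most_stable = ranking[0]
```

A gene that ranks well in one tissue or treatment may rank poorly in another, so 
the screen would need repeating for each new model system. 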

Interpretation of quantitative data must also be treated with caution. By 
comparing the lists of genes identified by differential expression one can perhaps 
gain insight into why two different species react in different ways to external stimuli. 
For example, rats and mice appear sensitive to the non-genotoxic effects of a wide 
range of peroxisome proliferators whilst Syrian hamsters and guinea pigs are largely 
resistant (Orton et al. 1984, Rodricks and Turnbull 1987, Lake et al. 1989, 1993, 
Makowska et al. 1992). A simplified approach to resolving the reason(s) why is to 
compare lists of up- and down-regulated genes in order to identify those which are 
expressed in only one species and, through background knowledge of the effects of 
the said gene, might suggest a mechanism of facilitated non-genotoxic carcinogenesis 
or protection. Of course, the situation is likely to be far more complex. Perhaps if 
there were one key gene protecting guinea pig from non-genotoxic effects and it was 
up-regulated 50 times by PPs, the same gene might only be up-regulated five times 
in the rat. However, since both were noted to be up-regulated, the importance of the 
gene may be overlooked. Just to complicate matters, a large change in expression 
does not necessarily mean a biologically important change. For example, what is the 
true relevance of gene Y which shows a 50-fold increase after a particular treatment, 
and gene Z which shows only a 5-fold increase? If one examines the literature one 
may find that historically, gene Y has often been shown to be up-regulated 40-60- 
fold by a number of unrelated stimuli — in light of this the 50-fold increase would 
appear less significant. However, the literature may show that gene Z has never been 
recorded as having more than doubled in expression — which makes your 5-fold 
increase all the more exciting. Perhaps even more interesting is if that same 5-fold 
increase has only been seen in related neoplasms or following treatment with related 
chemicals. 
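
The list comparison described above, and the caution about judging a fold change 
against what has historically been recorded for that gene, can be sketched as 
follows. The gene names, fold values and function names are hypothetical and 
serve only to illustrate the reasoning. 

```python
def species_specific_genes(regulated):
    """For each species, return the genes regulated in that species alone.

    `regulated` maps a species name to the set of genes found to be
    up- or down-regulated in that species.
    """
    specific = {}
    for species, genes in regulated.items():
        seen_elsewhere = set()
        for other, other_genes in regulated.items():
            if other != species:
                seen_elsewhere |= set(other_genes)
        specific[species] = set(genes) - seen_elsewhere
    return specific


def fold_relative_to_history(observed_fold, historical_max_fold):
    """Express an observed fold change as a multiple of the largest change
    previously recorded for the same gene."""
    if historical_max_fold <= 0:
        raise ValueError("historical maximum must be positive")
    return observed_fold / historical_max_fold


# Hypothetical lists of up-regulated genes after treatment.
regulated = {
    "rat": {"geneA", "geneB", "geneC"},
    "guinea pig": {"geneB", "geneD"},
}
specific = species_specific_genes(regulated)

# Gene Y: 50-fold now, but up to 60-fold with unrelated stimuli in the past.
gene_y = fold_relative_to_history(50.0, 60.0)
# Gene Z: 5-fold now, never more than doubled in the past.
gene_z = fold_relative_to_history(5.0, 2.0)
```

Note that the shared gene in this example disappears from both species-specific 
lists even if its fold change differs greatly between the species, which is 
exactly the kind of oversight the text warns against. 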

Problems in using the differential display approach 

Differential display technology originally held promise of an easily obtainable 
'fingerprint' of those genes which are up- or down-regulated in test animals/cells in 
a developmental process or following exposure to given stimuli. However, it has 
become clear that the fingerprinting process, whilst still valid, is much too complex 
to be represented by a single technique profile. This is because all differential display 
techniques have common and/or unique technical problems which preclude the 
isolation and identification of all those genes which show changes in expression. 
Furthermore, there are important genetic changes related to disease development 
which differential expression analysis is simply not designed to address. An example 
of this is the presence of small deletions, insertions, or point mutations such as those 
seen in activated oncogenes, tumour suppressor genes and individual poly- 
morphisms. Polymorphic variations, small though they usually are, are often 
regarded as being of paramount importance in explaining why some patients 
respond better than others to certain drug treatments (and, in logical extension, why 
some people are less affected by potentially dangerous xenobiotics/carcinogens than 
others). The identification of such point mutations and naturally occurring 
polymorphisms requires the subsequent application of sequencing, SSCP, DGGE 
or TGGE to the gene of interest. Furthermore, differential display is not designed 
to address issues such as alternatively spliced gene species or whether an increased 
abundance of mRNA is a result of increased transcription or increased mRNA 
stability. 



Conclusions 

Perhaps the main advantage of open system differential display techniques is that 
they are not limited by extant theories or researcher bias in revealing genes which are 
differentially expressed, since they are designed to amplify all genes which 
demonstrate altered expression. This means that they are useful for the isolation of 
previously unknown genes which may turn out to be useful biomarkers of a particular 
state or condition. At least one open system (SAGE) is also quantitative, thus 
eliminating the need to return to the original mRNA and carry out Northern/PCR 
analysis to confirm the result. However, the rapid progress of genome mapping 
projects means that over the next 5-10 years or so, the balance of experimental use 
will switch from open to closed differential display systems, particularly DNA 
arrays. Arrays are easier and faster to prepare and use, provide quantitative data, are 
suitable for high throughput analysis and can be tailored to look at specific signalling 
pathways or families of genes. Identification of all the gene sequences in human and 
common laboratory animals combined with improved DNA array technology, 
means that it will soon no longer be necessary to try to isolate differentially expressed 
genes using the technically more demanding open system approach. Thus, their 
main advantage (that of identifying unknown genes) will be largely eradicated. It is 
likely, therefore, that their sphere of application will be reduced to analysis of the 
less common laboratory species, since it will be some time yet before the genomes of 
such animals as zebrafish, electric eels, gerbils, crayfish and squid, for example, will 
be sequenced. 

Of course, in the end the question will always remain: What is the functional/ 
biological significance of the identified, differentially expressed genes? One 
persistent problem is understanding whether differentially expressed genes are a 
cause or consequence of the altered state. Furthermore, many chemicals, such as 
non-genotoxic carcinogens, are also mitogens and so genes associated with 
replication will also be up-regulated but may have little or nothing to do with the 
carcinogenic effect. Whilst differential display technology cannot hope to answer 
these questions, it does provide a springboard from which identification, regulatory 
and functional studies can be launched. Understanding the molecular mechanism of 
cellular responses is almost impossible without knowing the regulation and function 
of those genes and their condition (e.g. mutated). In an abstract sense, differential 
display can be likened to a still photograph, showing details of a fixed moment in 
time. Consider the Historian who knows the outcome of a battle and the placement 
and condition of the troops before the battle commenced, but is asked to try and 
deduce how the battle progressed and why it ended as it did from a few still 
photographs— an impossible task. In order to understand the battle, the Historian 
must find out the capabilities and motivation of the soldiers and their commanding 
officers, what the orders were and whether they were obeyed. He must examine the 
terrain, the remains of the battle and consider the effects the prevailing weather 
conditions exerted. Likewise, if mechanistic answers are to be forthcoming, the 
scientist must use differential display in combination with other techniques, such as 
knockout technology, the analysis of cell signalling pathways, mutation analysis and 
time and dose response analyses. Although this review has emphasized the 
importance of differential gene profiling, it should not be considered in isolation and 
the full impact of this approach will be strengthened if used in combination with 
functional genomics and proteomics (2-dimensional protein gels from isoelectric 
focusing and subsequent SDS electrophoresis and virtual 2D-maps using capillary 
electrophoresis). Proteomics is attracting much recent attention as many of the 
changes resulting in differential gene expression do not involve changes in mRNA 
levels, as described extensively herein, but rather protein-protein, protein-DNA and 
protein phosphorylation events which would require functional genomics or 
proteomic technologies for investigation. 

Despite the limitations of differential display technology, it is clear that many 
potential applications and benefits can be obtained from characterizing the genetic 
changes that occur in a cell during normal and disease development and in response 
to chemical or biological insult. In light of functional data, such profiling will 
provide a 'fingerprint' of each stage of development or response, and in the long 
term should help in the elucidation of specific and sensitive biomarkers for different 
types of chemical/biological exposure and disease states. The potential medical and 
therapeutic benefits of understanding such molecular changes are almost im- 
measurable. Amongst other things, such fingerprints could indicate the family or 
even specific type of chemical an individual has been exposed to plus the length 
and/or acuteness of that exposure, thus indicating the most prudent treatment. 
They may also help uncover differences in histologically identical cancers, provide 
diagnostic tests for the earliest stages of neoplasia and, again, perhaps indicate the 
most efficacious treatment. 

The Human Genome Project will be completed early in the next century and the 
DNA sequence of all the human genes will be known. The continuing development 
and evolution of differential gene expression technology will ensure that this 
knowledge contributes fully to the understanding of human disease processes. 

A method and apparatus for forming microarrays of biological 
samples on a support are disclosed. The method involves 
dispensing a known volume of a reagent at each selected 
array position, by tapping a capillary dispenser on the 
support under conditions effective to draw a defined volume 
of liquid onto the support. The apparatus is designed to 
produce a microarray of such regions in an automated 
fashion. 

METHODS FOR FABRICATING MICROARRAYS OF BIOLOGICAL SAMPLES 

CROSS-REFERENCE TO RELATED APPLICATION 

This application is a continuation-in-part of U.S. patent application Ser. No. 
08/261,388, filed Jun. 17, 1994, and now abandoned. 

The United States government may have certain rights in the present invention 
pursuant to Grant No. HG00450 awarded by the National Institutes of Health. 

FIELD OF THE INVENTION 

This invention relates to a method and apparatus for fabricating microarrays of 
biological samples for large scale screening assays, such as arrays of DNA 
samples to be used in DNA hybridization assays for genetic research and 
diagnostic applications. 

REFERENCES 

Abouzied, et al., Journal of AOAC International 77(2):495-500 (1994). 
Bohlander, et al., Genomics 13:1322-1324 (1992). 
Drmanac, et al., Science 260:1649-1652 (1993). 
Fodor, et al., Science 251:767-773 (1991). 
Khrapko, et al., DNA Sequence 1:375-388 (1991). 
Kuriyama, et al., AN ISFET BIOSENSOR, APPLIED BIOSENSORS (Donald Wise, Ed.), 
Butterworths, pp. 93-114 (1989). 
Lehrach, et al., HYBRIDIZATION FINGERPRINTING IN GENOME MAPPING AND SEQUENCING, 
GENOME ANALYSIS, VOL 1 (Davies and Tilghman, Eds.), Cold Spring Harbor Press, 
pp. 39-81 (1990). 
Maniatis, et al., MOLECULAR CLONING, A LABORATORY MANUAL, Cold Spring Harbor 
Press (1989). 
Nelson, et al., Nature Genetics 4:11-18 (1993). 
Pirrung, et al., U.S. Pat. No. 5,143,854 (1992). 
Riles, et al., Genetics 134:81-150 (1993). 
Schena, M. et al., Proc. Nat. Acad. Sci. USA 89:3894-3898 (1992). 
Southern, et al., Genomics 13:1008-1017 (1992). 

BACKGROUND OF THE INVENTION 

A variety of methods are currently available for making arrays of biological 
macromolecules, such as arrays of nucleic acid molecules or proteins. One method 
for making ordered arrays of DNA on a porous membrane is a "dot blot" approach. 
In this method, a vacuum manifold transfers a plurality, e.g., 96, aqueous 
samples of DNA from 3 millimeter diameter wells to a porous membrane. A common 
variant of this procedure is a "slot-blot" method in which the wells have 
highly-elongated oval shapes. 

The DNA is immobilized on the porous membrane by baking the membrane or exposing 
it to UV radiation. This is a manual procedure practical for making one array at 
a time and usually limited to 96 samples per array. "Dot-blot" procedures are 
therefore inadequate for applications in which many thousand samples must be 
determined. 

A more efficient technique employed for making ordered arrays of genomic 
fragments uses an array of pins dipped into the wells, e.g., the 96 wells of a 
microtitre plate, for transferring an array of samples to a substrate, such as a 
porous membrane. One array includes pins that are designed to spot a membrane in 
a staggered fashion, for creating an array of 9216 spots in a 22x22 cm area 
(Lehrach, et al., 1990). A limitation with this approach is that the volume of 
DNA spotted in each pixel of each array is highly variable. In addition, the 
number of arrays that can be made with each dipping is usually quite small. 

An alternate method of creating ordered arrays of nucleic acid sequences is 
described by Pirrung, et al. (1992), and also by Fodor, et al. (1991). The 
method involves synthesizing different nucleic acid sequences at different 
discrete regions of a support. This method employs elaborate synthetic schemes, 
and is generally limited to relatively short nucleic acid sample, e.g., less 
than 20 bases. A related method has been described by Southern, et al. (1992). 

Khrapko, et al. (1991) describes a method of making an oligonucleotide matrix by 
spotting DNA onto a thin layer of polyacrylamide. The spotting is done manually 
with a micropipette. 

None of the methods or devices described in the prior art are designed for mass 
fabrication of microarrays characterized by (i) a large number of micro-sized 
assay regions separated by a distance of 50-200 microns or less, and (ii) a 
well-defined amount, typically in the picomole range, of analyte associated with 
each region of the array. 

Furthermore, current technology is directed at performing such assays one at a 
time to a single array of DNA molecules. For example, the most common method for 
performing DNA hybridizations to arrays spotted onto porous membrane involves 
sealing the membrane in a plastic bag (Maniatis, et al., 1989) or a rotating 
glass cylinder (Robbins Scientific) with the labeled hybridization probe inside 
the sealed chamber. For arrays made on non-porous surfaces, such as a microscope 
slide, each array is incubated with the hybridization probe sealed under a 
coverslip. These techniques require a separate sealed chamber for each array 
which makes the screening and handling of many such arrays inconvenient and time 
intensive. 

Abouzied, et al. (1994) describes a method of printing horizontal lines of 
antibodies on a nitrocellulose membrane and separating regions of the membrane 
with vertical stripes of a hydrophobic material. Each vertical stripe is then 
reacted with a different antigen and the reaction between the immobilized 
antibody and an antigen is detected using a standard ELISA colorimetric 
technique. Abouzied's technique makes it possible to screen many one-dimensional 
arrays simultaneously on a single sheet of nitrocellulose. Abouzied makes the 
nitrocellulose somewhat hydrophobic using a line drawn with PAP Pen (Research 
Products International). However, Abouzied does not describe a technology that 
is capable of completely sealing the pores of the nitrocellulose. The pores of 
the nitrocellulose are still physically open and so the assay reagents can leak 
through the hydrophobic barrier during extended high temperature incubations or 
in the presence of detergents, which makes the Abouzied technique unacceptable 
for DNA hybridization assays. 

Porous membranes with printed patterns of hydrophilic/hydrophobic regions exist 
for applications such as ordered arrays of bacteria colonies. QA Life Sciences 
(San Diego Calif.) makes such a membrane with a grid pattern printed on it. 
However, this membrane has the same disadvantage as the Abouzied technique since 
reagents can still flow between the gridded arrays making them unusable for 
separate DNA hybridization assays. 

Pall Corporation make a 96-well plate with a porous filter heat sealed to the 
bottom of the plate. These plates are 
capable of containing different reagents in each well without 
cross-contamination. However, each well is intended to hold only one target 
element whereas the invention described here makes a microarray of many 
biomolecules in each subdivided region of the solid support. Furthermore, the 96 
well plates are at least 1 cm thick and prevent the use of the device for many 
colorimetric, fluorescent and radioactive detection formats which require that 
the membrane lie flat against the detection surface. The invention described 
here requires no further processing after the assay step since the barriers 
elements are shallow and do not interfere with the detection step, thereby 
greatly increasing convenience. 

Hyseq Corporation has described a method of making an "array of arrays" on a 
non-porous solid support for use with their sequencing by hybridization 
technique. The method described by Hyseq involves modifying the chemistry of the 
solid support material to form a hydrophobic grid pattern where each subdivided 
region contains a microarray of biomolecules. Hyseq's flat hydrophobic pattern 
does not make use of physical blocking as an additional means of preventing 
cross contamination. 

SUMMARY OF THE INVENTION 

The invention includes, in one aspect, a method of forming a microarray of 
analyte-assay regions on a solid support, where each region in the array has a 
known amount of a selected, analyte-specific reagent. The method involves first 
loading a solution of a selected analyte-specific reagent in a 
reagent-dispensing device having an elongate capillary channel (i) formed by 
spaced-apart, coextensive elongate members, (ii) adapted to hold a quantity of 
the reagent solution and (iii) having a tip region at which aqueous solution in 
the channel forms a meniscus. The channel is preferably formed by a pair of 
spaced-apart tapered elements. 

The tip of the dispensing device is tapped against a solid support at a defined 
position on the support surface with an impulse effective to break the meniscus 
in the capillary channel, and deposit a selected volume of solution on the 
surface, preferably a selected volume in the range 0.01 to 100 nl. The two steps 
are repeated until the desired array is formed. 

The method may be practiced in forming a plurality of such arrays, where the 
solution-depositing step is applied to a selected position on each of a 
plurality of solid supports at each repeat cycle. 

The dispensing device may be loaded with a new solution, by the steps of (i) 
dipping the capillary channel of the device in a wash solution, (ii) removing 
wash solution drawn into the capillary channel, and (iii) dipping the capillary 
channel into the new reagent solution. 

Also included in the invention is an automated apparatus for forming a 
microarray of analyte-assay regions on a plurality of solid supports, where each 
region in the array has a known amount of a selected, analyte-specific reagent. 
The apparatus has a holder for holding, at known positions, a plurality of 
planar supports, and a reagent dispensing device of the type described above. 

The apparatus further includes a positioning structure for positioning the 
dispensing device at a selected array position with respect to a support in said 
holder, and a dispensing structure for moving the dispensing device into tapping 
engagement against a support with a selected impulse effective to deposit a 
selected volume on the support, e.g., a selected volume in the volume range 0.01 
to 100 nl. 

The positioning and dispensing structures are controlled by a control unit in 
the apparatus. The unit operates to (i) place the dispensing device at a loading 
station, (ii) move the capillary channel in the device into a selected reagent 
at the loading station, to load the dispensing device with the reagent, and 
(iii) dispense the reagent at a defined array position on each of the supports 
on said holder. The unit may further operate, at the end of a dispensing cycle, 
to wash the dispensing device by (i) placing the dispensing device at a washing 
station, (ii) moving the capillary channel in the device into a wash fluid, to 
load the dispensing device with the fluid, and (iii) removing the wash fluid 
prior to loading the dispensing device with a fresh selected reagent. 

The dispensing device in the apparatus may be one of a plurality of such devices 
which are carried on the arm for dispensing different analyte assay reagents at 
selected spaced array positions. 

In another aspect, the invention includes a substrate with a surface having a 
microarray of at least 10³ distinct polynucleotide or polypeptide biopolymers in 
a surface area of less than about 1 cm². Each distinct biopolymer (i) is 
disposed at a separate, defined position in said array, (ii) has a length of at 
least 50 subunits, and (iii) is present in a defined amount between about 0.1 
femtomoles and 100 nanomoles. 

In one embodiment, the surface is glass slide surface coated with a polycationic 
polymer, such as polylysine, and the biopolymers are polynucleotides. In another 
embodiment, the substrate has a water-impermeable backing, a water-permeable 
film formed on the backing, and a grid formed on the film. The grid is composed 
of intersecting water-impervious grid elements extending from said backing to 
positions raised above the surface of said film, and partitions the film into a 
plurality of water-impervious cells. A biopolymer array is formed within each 
well. 

More generally, there is provided a substrate for use in detecting binding of 
labeled polynucleotides to one or more of a plurality different-sequence, 
immobilized polynucleotides. The substrate includes, in one aspect, a glass 
support, a coating of a polycationic polymer, such as polylysine, on said 
surface of the support, and an array of distinct polynucleotides 
electrostatically bound non-covalently to said coating, where each distinct 
biopolymer is disposed at a separate, defined position in a surface array of 
polynucleotides. 

In another aspect, the substrate includes a water-impermeable backing, a 
water-permeable film formed on the backing, and a grid formed on the film, where 
the grid is composed of intersecting water-impervious grid elements extending 
from the backing to positions raised above the surface of the film, forming a 
plurality of cells. A biopolymer array is formed within each cell. 

Also forming part of the invention is a method of detecting differential 
expression of each of a plurality of genes in a first cell type, with respect to 
expression of the same genes in a second cell type. In practicing the method, 
there is first produced fluorescent-labeled cDNAs from mRNAs isolated from the 
two cells types, where the cDNAs from the first and second cell types are 
labeled with first and second different fluorescent reporters. 

A mixture of the labeled cDNAs from the two cell types is added to an array of 
polynucleotides representing a plurality of known genes derived from the two 
cell types, under conditions that result in hybridization of the cDNAs to 
complementary-sequence polynucleotides in the array. The array is then examined 
by fluorescence under fluorescence excitation conditions in which (i) 
polynucleotides in the array that are hybridized predominantly to cDNAs derived 
from one of the first or second cell types give a distinct first or second 
fluorescence emission color, respectively, and (ii) polynucleotides in the array 
that are hybridized to substantially equal numbers of cDNAs derived from the 
first and second cell types give a distinct combined fluorescence emission 
color, respectively. The relative expression of known genes in the two cell 
types can then be determined by the observed fluorescence emission color of each 
spot. 

These and other objects and features of the invention will become more fully 
apparent when the following detailed description of the invention is read in 
conjunction with the accompanying figures. 

The file of this patent contains at least one drawing executed in color. Copies 
of this patent with color drawing(s) will be provided by the Patent and 
Trademark Office upon request and payment of the necessary fee. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a side view of a reagent-dispensing device having a open-capillary 
dispensing head constructed for use in one embodiment of the invention; 

FIGS. 2A-2C illustrate steps in the delivery of a fixed-volume bead on a 
hydrophobic surface employing the dispensing head from FIG. 1, in accordance 
with one embodiment of the method of the invention; 

FIG. 3 shows a portion of a two-dimensional array of analyte-assay regions 
constructed according to the method of the invention; 

FIG. 4 is a planar view showing components of an automated apparatus for forming 
arrays in accordance with the invention. 

FIG. 5 shows a fluorescent image of an actual 20x20 array of 400 
fluorescently-labeled DNA samples immobilized on a poly-l-lysine coated slide, 
where the total area covered by the 400 element array is 16 square millimeters; 

FIG. 6 is a fluorescent image of a 1.8 cm x 1.8 cm microarray containing lambda 
clones with yeast inserts, the fluorescent signal arising from the hybridization 
to the array with approximately half the yeast genome labeled with a green 
fluorophore and the other half with a red fluorophore; 

FIG. 7 shows the translation of the hybridization image of FIG. 6 into a 
karyotype of the yeast genome, where the elements of FIG. 6 microarray contain 
yeast DNA sequences that have been previously physically mapped in the yeast 
genome; 

FIG. 8 shows a fluorescent image of a 0.5 cm x 0.5 cm microarray of 24 cDNA 
clones, where the microarray was hybridized simultaneously with total cDNA from 
wild type Arabidopsis plant labeled with a green fluorophore and total cDNA from 
a transgenic Arabidopsis plant labeled with a red fluorophore, and the arrow 
points to the cDNA clone representing the gene introduced into the transgenic 
Arabidopsis plant; 

FIG. 9 shows a plan view of substrate having an array of cells formed by barrier 
elements in the form of a grid; 

FIG. 10 shows an enlarged plan view of one of the cells in the substrate in FIG. 
9, showing an array of polynucleotide regions in the cell; 

FIG. 11 is an enlarged sectional view of the substrate in FIG. 9, taken along a 
section line in that figure; and 

FIG. 12 is a scanned image of a 3 cm x 3 cm nitrocellulose solid support 
containing four identical arrays of M13 clones in each of four quadrants, where 
each quadrant was hybridized simultaneously to a different oligonucleotide using 
an open face hybridization method. 

DETAILED DESCRIPTION OF THE INVENTION 

I. Definitions 

Unless indicated otherwise, the terms defined below have the following meanings: 

"Ligand" refers to one member of a ligand/anti-ligand binding pair. The ligand 
may be, for example, one of the nucleic acid strands in a complementary, 
hybridized nucleic acid duplex binding pair; an effector molecule in an 
effector/receptor binding pair; or an antigen in an antigen/antibody or 
antigen/antibody fragment binding pair. 

"Anti-ligand" refers to the opposite member of a ligand/anti-ligand binding 
pair. The anti-ligand may be the other of the nucleic acid strands in a 
complementary, hybridized nucleic acid duplex binding pair; the receptor 
molecule in an effector/receptor binding pair; or an antibody or antibody 
fragment molecule in antigen/antibody or antigen/antibody fragment binding pair, 
respectively. 

"Analyte" or "analyte molecule" refers to a molecule, typically a macromolecule, 
such as a polynucleotide or polypeptide, whose presence, amount, and/or identity 
are to be determined. The analyte is one member of a ligand/anti-ligand pair. 

"Analyte-specific assay reagent" refers to a molecule effective to bind 
specifically to an analyte molecule. The reagent is the opposite member of a 
ligand/anti-ligand binding pair. 

An "array of regions on a solid support" is a linear or two-dimensional array of 
preferably discrete regions, each having a finite area, formed on the surface of 
a solid support. 

A "microarray" is an array of regions having a density of discrete regions of at 
least about 100/cm², and preferably at least about 1000/cm². The regions in a 
microarray have typical dimensions, e.g., diameters, in the range of between 
about 10-250 μm, and are separated from other regions in the array by about the 
same distance. 

A support surface is "hydrophobic" if a aqueous-medium droplet applied to the 
surface does not spread out substantially beyond the area size of the applied 
droplet. That is, the surface acts to prevent spreading of the droplet applied 
to the surface by hydrophobic interaction with the droplet. 

A "meniscus" means a concave or convex surface that forms on the bottom of a 
liquid in a channel as a result of the surface tension of the liquid. 

"Distinct biopolymers", as applied to the biopolymers forming a microarray, 
means an array member which is distinct from other array members on the basis of 
a different biopolymer sequence, and/or different concentrations of the same or 
distinct biopolymers, and/or different mixtures of distinct or 
different-concentration biopolymers. Thus an array of "distinct polynucleotides" 
means an array containing, as its members, (i) distinct polynucleotides, which 
may have a defined amount in each member, (ii) different, graded concentrations 
of given-sequence polynucleotides, and/or (iii) different-composition mixtures 
of two or more distinct polynucleotides. 

"Cell type" means a cell from a given source, e.g., a tissue, or organ, or a 
cell in a given state of differentiation, or a cell associated with a given 
pathology or genetic makeup. 

II. Method of Microarray Formation 

This section describes a method of forming a microarray of analyte-assay regions 
on a solid support or substrate, where each region in the array has a known 
amount of a selected, analyte-specific reagent. 
FIG. 1 illustrates, in a partially schematic view, a reagent-dispensing
device 10 useful in practicing the method. The device generally
includes a reagent dispenser 12 having an elongate open capillary
channel 14 adapted to hold a quantity of the reagent solution, such as
indicated at 16, as will be described below. The capillary channel is
formed by a pair of spaced-apart, coextensive, elongate members 12a,
12b which are tapered toward one another and converge at a tip or tip
region 18 at the lower end of the channel. More generally, the open
channel is formed by at least two elongate, spaced-apart members
adapted to hold a quantity of reagent solutions and having a tip region
at which aqueous solution in the channel forms a meniscus, such as the
concave meniscus illustrated at 20 in FIG. 2A. The advantages of the
open channel construction of the dispenser are discussed below.

With continued reference to FIG. 1, the dispenser device also includes
structure for moving the dispenser rapidly toward and away from a
support surface, for effecting deposition of a known amount of solution
in the dispenser on a support, as will be described below with
reference to FIGS. 2A-2C. In the embodiment shown, this structure
includes a solenoid 22 which is activatable to draw a solenoid piston
24 rapidly downwardly, then release the piston, e.g., under spring
bias, to a normal, raised position, as shown. The dispenser is carried
on the piston by a connecting member 26, as shown. The just-described
moving structure is also referred to herein as dispensing means for
moving the dispenser into engagement with a solid support, for
dispensing a known volume of fluid on the support.

The dispensing device just described is carried on an arm 28 that may
be moved either linearly or in an x-y plane to position the dispenser
at a selected deposition position, as will be described.

FIGS. 2A-2C illustrate the method of depositing a known amount of
reagent solution in the just-described dispenser on the surface of a
solid support, such as the support indicated at 30. The support is a
polymer, glass, or other solid-material support having a surface
indicated at 31.

In one general embodiment, the surface is a relatively hydrophilic,
i.e., wettable surface, such as a surface having native, bound or
covalently attached charged groups. One such surface described below is
a glass surface having an absorbed layer of a polycationic polymer,
such as poly-l-lysine.

In another embodiment, the surface has or is formed to have a
relatively hydrophobic character, i.e., one that causes aqueous medium
deposited on the surface to bead. A variety of known hydrophobic
polymers, such as polystyrene, polypropylene, or polyethylene have
desired hydrophobic properties, as do glass and a variety of lubricant
or other hydrophobic films that may be applied to the support surface.

Initially, the dispenser is loaded with a selected analyte-specific
reagent solution, such as by dipping the dispenser tip, after washing,
into a solution of the reagent, and allowing filling by capillary flow
into the dispenser channel. The dispenser is now moved to a selected
position with respect to a support surface, placing the dispenser tip
directly above the support-surface position at which the reagent is to
be deposited. This movement takes place with the dispenser tip in its
raised position, as seen in FIG. 2A, where the tip is typically at
least several (1-5) mm above the surface of the substrate.

With the dispenser so positioned, solenoid 22 is now activated to cause
the dispenser tip to move rapidly toward and away from the substrate
surface, making momentary contact with the surface, in effect, tapping
the tip of the dispenser against the support surface. The tapping
movement of the tip against the surface acts to break the liquid
meniscus in the tip channel, bringing the liquid in the tip into
contact with the support surface. This, in turn, produces a flowing of
the liquid into the capillary space between the tip and the surface,
acting to draw liquid out of the dispenser channel, as seen in FIG. 2B.

FIG. 2C shows flow of fluid from the tip onto the support surface,
which in this case is a hydrophobic surface. The figure illustrates
that liquid continues to flow from the dispenser onto the support
surface until it forms a liquid bead 32. At a given bead size, i.e.,
volume, the tendency of liquid to flow onto the surface will be
balanced by the hydrophobic surface interaction of the bead with the
support surface, which acts to limit the total bead area on the
surface, and by the surface tension of the droplet, which tends toward
a given bead curvature. At this point, a given bead volume will have
formed, and continued contact of the dispenser tip with the bead, as
the dispenser tip is being withdrawn, will have little or no effect on
bead volume.

For liquid-dispensing on a more hydrophilic surface, the liquid will
have less of a tendency to bead, and the dispensed volume will be more
sensitive to the total dwell time of the dispenser tip in the immediate
vicinity of the support surface, e.g., the positions illustrated in
FIGS. 2B and 2C.

The desired deposition volume, i.e., bead volume, formed by this method
is preferably in the range 2 pl (picoliters) to 2 nl (nanoliters),
although larger volumes may be dispensed. It will be appreciated that
the selected dispensed volume will depend on (i) the "footprint" of the
dispenser tip, i.e., the size of the area spanned by the tip, (ii) the
hydrophobicity of the support surface, and (iii) the time of contact
with and rate of withdrawal of the tip from the support surface. In
addition, bead size may be reduced by increasing the viscosity of the
medium, effectively reducing the flow time of liquid from the dispenser
onto the support surface. The drop size may be further constrained by
depositing the drop in a hydrophilic region surrounded by a hydrophobic
grid pattern on the support surface.

In a typical embodiment, the dispenser tip is tapped rapidly against
the support surface, with a total residence time in contact with the
support of less than about 1 msec, and a rate of upward travel from the
surface of about 10 cm/sec.

Assuming that the bead that forms on contact with the surface is a
hemispherical bead, with a diameter approximately equal to the width of
the dispenser tip, as shown in FIG. 2C, the volume of the bead formed
in relation to dispenser tip width (d) is given in Table 1 below. As
seen, the volume of the bead ranges between 2 pl to 2 nl as the width
size is increased from about 20 to 200 μm.

TABLE 1

At a given tip size, bead volume can be reduced in a controlled fashion
by increasing surface hydrophobicity, reducing time of contact of the
tip with the surface, increasing rate of movement of the tip away from
the surface,
and/or increasing the viscosity of the medium. Once these parameters
are fixed, a selected deposition volume in the desired pl to nl range
can be achieved in a repeatable fashion.

After depositing a bead at one selected location on a support, the tip
is typically moved to a corresponding position on a second support, a
droplet is deposited at that position, and this process is repeated
until a liquid droplet of the reagent has been deposited at a selected
position on each of a plurality of supports.

The tip is then washed to remove the reagent liquid, filled with
another reagent liquid and this reagent is now deposited at another
array position on each of the supports. In one embodiment, the tip is
washed and refilled by the steps of (i) dipping the capillary channel
of the device in a wash solution, (ii) removing wash solution drawn
into the capillary channel, and (iii) dipping the capillary channel
into the new reagent solution.

From the foregoing, it will be appreciated that the tweezers-like,
open-capillary dispenser tip provides the advantages that (i) the open
channel of the tip facilitates rapid, efficient washing and drying
before reloading the tip with a new reagent, (ii) passive capillary
action can load the sample directly from a standard microwell plate
while retaining sufficient sample in the open capillary reservoir for
the printing of numerous arrays, (iii) open capillaries are less prone
to clogging than closed capillaries, and (iv) open capillaries do not
require a perfectly faced bottom surface for fluid delivery.

A portion of a microarray 36 formed on the surface 38 of a solid
support 40 in accordance with the method just described is shown in
FIG. 3. The array is formed of a plurality of analyte-specific reagent
regions, such as regions 42, where each region may include a different
analyte-specific reagent. As indicated above, the diameter of each
region is preferably between about 20-200 μm. The spacing between each
region and its closest (non-diagonal) neighbor, measured from
center-to-center (indicated at 44), is preferably in the range of about
20-400 μm. Thus, for example, an array having a center-to-center
spacing of about 250 μm contains about 40 regions/cm or 1,600
regions/cm². After formation of the array, the support is treated to
evaporate the liquid of the droplet forming each region, to leave a
desired array of dried, relatively flat regions. This drying may be
done by heating or under vacuum.

In some cases, it is desired to first rehydrate the droplets containing
the analyte reagents to allow for more time for adsorption to the solid
support. It is also possible to spot out the analyte reagents in a
humid environment so that droplets do not dry until the arraying
operation is complete.

III. Automated Apparatus for Forming Arrays

In another aspect, the invention includes an automated apparatus for
forming an array of analyte-assay regions on a solid support, where
each region in the array has a known amount of a selected,
analyte-specific reagent.

The apparatus is shown in planar, and partially schematic view in FIG.
4. A dispenser device 72 in the apparatus has the basic construction
described above with respect to FIG. 1, and includes a dispenser 74
having an open-capillary channel terminating at a tip, substantially as
shown in FIGS. 1 and 2A-2C.

The dispenser is mounted in the device for movement toward and away
from a dispensing position at which the tip of the dispenser taps a
support surface, to dispense a selected volume of reagent solution, as
described above. This movement is effected by a solenoid 76 as
described above. Solenoid 76 is under the control of a control unit 77
whose operation will be described below. The solenoid is also referred
to herein as dispensing means for moving the device into tapping
engagement with a support, when the device is positioned at a defined
array position with respect to that support.

The dispenser device is carried on an arm 74 which is threadedly
mounted on a worm screw 80 driven (rotated) in a desired direction by a
stepper motor 82 also under the control of unit 77. At its left end in
the figure screw 80 is carried in a sleeve 84 for rotation about the
screw axis. At its other end, the screw is mounted to the drive shaft
of the stepper motor, which in turn is carried on a sleeve 86. The
dispenser device, worm screw, the two sleeves mounting the worm screw,
and the stepper motor used in moving the device in the "x" (horizontal)
direction in the figure form what is referred to here collectively as a
displacement assembly 86.

The displacement assembly is constructed to produce precise,
micro-range movement in the direction of the screw, i.e., along an x
axis in the figure. In one mode, the assembly functions to move the
dispenser in x-axis increments having a selected distance in the range
5-25 μm. In another mode, the dispenser unit may be moved in precise
x-axis increments of several microns or more, for positioning the
dispenser at associated positions on adjacent supports, as will be
described below.

The displacement assembly, in turn, is mounted for movement in the "y"
(vertical) axis of the figure, for positioning the dispenser at a
selected y axis position. The structure mounting the assembly includes
a fixed rod 88 mounted rigidly between a pair of frame bars 90, 92, and
a worm screw 94 mounted for rotation between a pair of frame bars 96,
98. The worm screw is driven (rotated) by a stepper motor 100 which
operates under the control of unit 77. The motor is mounted on bar 96,
as shown.

The structure just described, including worm screw 94 and motor 100, is
constructed to produce precise, micro-range movement in the direction
of the screw, i.e., along a y axis in the figure. As above, the
structure functions in one mode to move the dispenser in y-axis
increments having a selected distance in the range 5-250 μm, and in a
second mode, to move the dispenser in precise y-axis increments of
several microns (μm) or more, for positioning the dispenser at
associated positions on adjacent supports.

The displacement assembly and structure for moving this assembly in the
y axis are referred to herein collectively as positioning means for
positioning the dispensing device at a selected array position with
respect to a support.

A holder 102 in the apparatus functions to hold a plurality of
supports, such as supports 104 on which the microarrays of reagent
regions are to be formed by the apparatus. The holder provides a number
of recessed slots, such as slot 106, which receive the supports, and
position them at precise selected positions with respect to the frame
bars on which the dispenser moving means is mounted.

As noted above, the control unit in the device functions to actuate the
two stepper motors and dispenser solenoid in a sequence designed for
automated operation of the apparatus in forming a selected microarray
of reagent regions on each of a plurality of supports.

The control unit is constructed, according to conventional
microprocessor control principles, to provide appropriate signals to
each of the solenoid and each of the stepper motors, in a given timed
sequence and for appropriate signaling time. The construction of the
unit, and the settings
that are selected by the user to achieve a desired array pattern, will
be understood from the following description of a typical apparatus
operation.

Initially, one or more supports are placed in one or more slots in the
holder. The dispenser is then moved to a position directly above a well
(not shown) containing a solution of the first reagent to be dispensed
on the support(s). The dispenser solenoid is actuated now to lower the
dispenser tip into this well, causing the capillary channel in the
dispenser to fill. Motors 82, 100 are now actuated to position the
dispenser at a selected array position at the first of the supports.
Solenoid actuation of the dispenser is then effective to dispense a
selected-volume droplet of that reagent at this location. As noted
above, this operation is effective to dispense a selected volume
preferably between 2 pl and 2 nl of the reagent solution.

The dispenser is now moved to the corresponding position at an adjacent
support and a similar volume of the solution is dispensed at this
position. The process is repeated until the reagent has been dispensed
at this preselected corresponding position on each of the supports.

Where it is desired to dispense a single reagent at more than two array
positions on a support, the dispenser may be moved to different array
positions at each support, before moving the dispenser to a new
support, or solution can be dispensed at individual positions on each
support, at one selected position, then the cycle repeated for each new
array position.

To dispense the next reagent, the dispenser is positioned over a wash
solution (not shown), and the dispenser tip is dipped in and out of
this solution until the reagent solution has been substantially washed
from the tip. Solution can be removed from the tip, after each dipping,
by vacuum, compressed air spray, sponge, or the like.

The dispenser tip is now dipped in a second reagent well, and the
filled tip is moved to a second selected array position in the first
support. The process of dispensing reagent at each of the corresponding
second-array positions is then carried out as above. This process is
repeated until an entire microarray of reagent solutions on each of the
supports has been formed.

IV. Microarray Substrate

This section describes embodiments of a substrate having a microarray
of biological polymers carried on the substrate surface. Subsection A
describes a multi-cell substrate, each cell of which contains a
microarray, and preferably an identical microarray, of distinct
biopolymers, such as distinct polynucleotides, formed on a porous
surface. Subsection B describes a microarray of distinct
polynucleotides bound on a glass slide coated with a polycationic
polymer.

A. Multi-Cell Substrate

FIG. 9 illustrates, in plan view, a substrate 110 constructed according
to the invention. The substrate has an 8×12 rectangular array 112 of
cells, such as cells 114, 116, formed on the substrate surface. With
reference to FIG. 10, each cell, such as cell 114, in turn supports a
microarray 118 of distinct biopolymers, such as polypeptides or
polynucleotides at known, addressable regions of the microarray. Two
such regions forming the microarray are indicated at 120, and
correspond to regions, such as regions 42, forming the microarray of
distinct biopolymers shown in FIG. 3.

The 96-cell array shown in FIG. 9 typically has array dimensions
between about 12 and 244 mm in width and 8 and 400 mm in length, with
the cells in the array having width and length dimension of 1/12 and
1/8 the array width and length dimensions, respectively, i.e., between
about 1 and 20 mm in width and 1 and 50 mm in length.

The construction of substrate is shown cross-sectionally in FIG. 11,
which is an enlarged sectional view taken along view line 124 in FIG.
9. The substrate includes a water-impermeable backing 126, such as a
glass slide or rigid polymer sheet. Formed on the surface of the
backing is a water-permeable film 128. The film is formed of a porous
membrane material, such as nitrocellulose membrane, or a porous web
material, such as a nylon, polypropylene, or PVDF porous polymer
material. The thickness of the film is preferably between about 10 and
1000 μm. The film may be applied to the backing by spraying or coating
uncured material on the backing, or by applying a preformed membrane to
the backing. The backing and film may be obtained as a preformed unit
from commercial source, e.g., a plastic-backed nitrocellulose film
available from Schleicher and Schuell Corporation.

With continued reference to FIG. 11, the film-covered surface in the
substrate is partitioned into a desired array of cells by
water-impermeable grid lines, such as lines 130, 132, which have
infiltrated the film down to the level of the backing, and extend above
the surface of the film as shown, typically a distance of 100 to 2000
μm above the film surface.

The grid lines are formed on the substrate by laying down an uncured or
otherwise flowable resin or elastomer solution in an array grid,
allowing the material to infiltrate the porous film down to the
backing, then curing or otherwise hardening the grid lines to form the
cell-array substrate.

One preferred material for the grid is a flowable silicone available
from Loctite Corporation. The barrier material can be extruded through
a narrow syringe (e.g., 22 gauge) using air pressure or mechanical
pressure. The syringe is moved relative to the solid support to print
the barrier elements as a grid pattern. The extruded bead of silicone
wicks into the pores of the solid support and cures to form a shallow
waterproof barrier separating the regions of the solid support.

In alternative embodiments, the barrier element can be a wax-based
material or a thermoset material such as epoxy. The barrier material
can also be a UV-curing polymer which is exposed to UV light after
being printed onto the solid support. The barrier material may also be
applied to the solid support using printing techniques such as
silk-screen printing. The barrier material may also be a heat-seal
stamping of the porous solid support which seals its pores and forms a
water-impervious barrier element. The barrier material may also be a
shallow grid which is laminated or otherwise adhered to the solid
support.

In addition to plastic-backed nitrocellulose, the solid support can be
virtually any porous membrane with or without a non-porous backing.
Such membranes are readily available from numerous vendors and are made
from nylon, PVDF, polysulfone and the like. In an alternative
embodiment, the barrier element may also be used to adhere the porous
membrane to a non-porous backing in addition to functioning as a
barrier to prevent cross contamination of the assay reagents.

In an alternative embodiment, the solid support can be of a non-porous
material. The barrier can be printed either before or after the
microarray of biomolecules is printed on the solid support.

As can be appreciated, the cells formed by the grid lines and the
underlying backing are water-impermeable, having side barriers
projecting above the porous film in the cells. Thus, defined-volume
samples can be placed in each well without risk of cross-contamination
with sample material in adjacent cells. In FIG. 11, defined volumes
samples, such as sample 154, are shown in the cells.
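
The cell dimensions stated for the 96-cell array of FIG. 9 follow from dividing the array width by 12 and the array length by 8. The arithmetic can be sketched in Python as follows; the function name `cell_size_mm` is illustrative only, and the width taken up by the barrier lines is ignored as a simplifying assumption.

```python
def cell_size_mm(array_width_mm, array_length_mm, cols=12, rows=8):
    """Return (cell width, cell length) in mm for a grid of equal cells.

    Assumes cell width is 1/12 of the array width and cell length is 1/8
    of the array length, as stated in the text. Barrier line width is
    ignored (an assumption made for this sketch).
    """
    return array_width_mm / cols, array_length_mm / rows


# Lower and upper ends of the stated array dimensions.
smallest = cell_size_mm(12, 8)
largest = cell_size_mm(244, 400)
print(smallest, largest)
```

The two results bracket the stated cell range of about 1 to 20 mm in width and 1 to 50 mm in length.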
As noted above, each well contains a microarray of
distinct biopolymers. In one general embodiment, the
microarrays in the well are identical arrays of distinct
biopolymers, e.g., different sequence polynucleotides. Such
arrays can be formed in accordance with the methods
described in Section II, by depositing a first selected
polynucleotide at the same selected microarray position in each
of the cells, then depositing a second polynucleotide at a
different microarray position in each well, and so on until a
complete, identical microarray is formed in each cell.
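
The deposition order just described, one polynucleotide at the same position in every cell before the next polynucleotide is loaded, can be sketched as a nested loop. This is an illustration only: `print_plan`, the step labels, and the reagent names are hypothetical and are not part of the apparatus description.

```python
def print_plan(reagents, n_cells):
    """Yield (action, reagent, cell, position) steps in reagent-major order.

    Each reagent is assigned one microarray position (its index in the
    list) and is deposited at that position in every cell before the tip
    is washed and reloaded with the next reagent.
    """
    for position, reagent in enumerate(reagents):
        yield ("load", reagent, None, None)
        for cell in range(n_cells):
            yield ("deposit", reagent, cell, position)
        yield ("wash", reagent, None, None)


# Hypothetical example: two polynucleotides printed into three cells.
steps = list(print_plan(["polyA", "polyB"], n_cells=3))
for step in steps:
    print(step)
```

Ordering the loop by reagent rather than by cell keeps the number of wash and reload cycles equal to the number of reagents, which is the property the text relies on to give every cell an identical microarray.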

In a preferred embodiment, each microarray contains
about 10³ distinct polynucleotide or polypeptide biopolymers
per surface area of less than about 1 cm². Also in a
preferred embodiment, the biopolymers in each microarray
region are present in a defined amount between about 0.1
femtomoles and 100 nanomoles. The ability to form
high-density arrays of biopolymers, where each region is formed
of a well-defined amount of deposited material, can be
achieved in accordance with the microarray-forming method
described in Section II.
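
Section II ties the deposited amount to tip width by assuming a hemispherical bead whose diameter equals the dispenser tip width d. Under that assumption the bead volume is (2/3)π(d/2)³. The Python sketch below evaluates this for tip widths in the 20 to 200 μm range mentioned there; the function name is illustrative, and the results are geometric estimates, not measured values.

```python
import math


def hemisphere_volume_pl(tip_width_um):
    """Volume in picoliters of a hemispherical bead of the given diameter.

    Uses V = (2/3) * pi * r**3 with r = tip_width_um / 2, and the
    conversion 1 pl = 1000 cubic micrometers.
    """
    radius_um = tip_width_um / 2.0
    volume_um3 = (2.0 / 3.0) * math.pi * radius_um ** 3
    return volume_um3 / 1000.0


for d in (20, 100, 200):
    print(f"d = {d} um -> {hemisphere_volume_pl(d):.1f} pl")
```

The estimates at 20 μm and 200 μm come out near 2 pl and 2 nl respectively, consistent with the volume range quoted in Section II.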

Also in a preferred embodiment, the biopolymers are
polynucleotides having lengths of at least about 50 bp, i.e.,
substantially longer than oligonucleotides which can be
formed in high-density arrays by schemes involving parallel,
step-wise polymer synthesis on the array surface.

In the case of a polynucleotide array, in an assay 
procedure, a small volume of the labeled DNA probe mix- 
ture in a standard hybridization solution is loaded onto each 
cell. The solution will spread to cover the entire microarray 
and stop at the barrier elements. The solid support is then
incubated in a humid chamber at the appropriate temperature 
as required by the assay. 

Each assay may be conducted in an "open-face" format
where no further sealing step is required, since the
hybridization solution will be kept properly hydrated by the water
vapor in the humid chamber. At the conclusion of the
incubation step, the entire solid support containing the
numerous microarrays is rinsed quickly enough to dilute the
assay reagents so that no significant cross contamination
occurs. The entire solid support is then reacted with
detection reagents if needed and analyzed using standard
colorimetric, radioactive or fluorescent detection means. All
processing and detection steps are performed simultaneously
to all of the microarrays on the solid support,
ensuring uniform assay conditions for all of the microarrays
on the solid support.

B. Glass-Slide Polynucleotide Array 

FIG. 5 shows a substrate 136 formed according to another
aspect of the invention, and intended for use in detecting
binding of labeled polynucleotides to one or more of a
plurality of distinct polynucleotides. The substrate includes a
glass substrate 138 having formed on its surface, a coating
of a polycationic polymer, preferably a cationic polypeptide,
such as polylysine or polyarginine. Formed on the
polycationic coating is a microarray 140 of distinct
polynucleotides, each localized at known selected array
regions, such as regions 142.

The slide is coated by placing a uniform-thickness film of
a polycationic polymer, e.g., poly-l-lysine, on the surface of
a slide and drying the film to form a dried coating. The
amount of polycationic polymer added is sufficient to form
at least a monolayer of polymers on the glass surface. The
polymer film is bound to surface via electrostatic binding
between negative silyl-OH groups on the surface and
charged amine groups in the polymers. Poly-l-lysine coated
glass slides may be obtained commercially, e.g., from Sigma
Chemical Co. (St. Louis, Mo.).



To form the microarray, defined volumes of distinct
polynucleotides are deposited on the polymer-coated slide,
as described in Section II. According to an important feature
of the substrate, the deposited polynucleotides remain bound
to the coated slide surface non-covalently when an aqueous
DNA sample is applied to the substrate under conditions
which allow hybridization of reporter-labeled polynucleotides
in the sample to complementary-sequence
(single-stranded) polynucleotides in the substrate array. The method
is illustrated in Examples 1 and 2.

To illustrate this feature, a substrate of the type just
described, but having an array of same-sequence
polynucleotides, was mixed with fluorescent-labeled
complementary DNA under hybridization conditions. After
washing to remove non-hybridized material, the substrate
was examined by low-power fluorescence microscopy. The
array can be visualized by the relatively uniform labeling
pattern of the array regions.

In a preferred embodiment, each microarray contains at
least 10³ distinct polynucleotide or polypeptide biopolymers
per surface area of less than about 1 cm². In the embodiment
shown in FIG. 5, the microarray contains 400 regions in an
area of about 16 mm², or 2.5×10³ regions/cm². Also in a
preferred embodiment, the polynucleotides in each microarray
region are present in a defined amount between about 0.1
femtomoles and 100 nanomoles. As above, the ability to form
high-density arrays of
this type, where each region is formed of a well-defined
amount of deposited material, can be achieved in accordance
with the microarray-forming method described in Section II.
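
The density figure for the FIG. 5 embodiment is a unit conversion, and the same arithmetic reproduces the 250 μm spacing example given in Section II. A short Python sketch, with illustrative function names and the assumption of a square grid for the pitch calculation:

```python
def regions_per_cm2(n_regions, area_mm2):
    """Areal density in regions per square centimeter (1 cm^2 = 100 mm^2)."""
    return n_regions * 100.0 / area_mm2


def density_from_pitch(pitch_um):
    """Regions per cm^2 for a square grid with the given center-to-center pitch.

    A square grid is assumed here; 1 cm = 10,000 um.
    """
    per_cm = 10_000.0 / pitch_um
    return per_cm ** 2


fig5_density = regions_per_cm2(400, 16)   # 400 regions in about 16 mm^2
pitch_density = density_from_pitch(250)   # 250 um center-to-center spacing
print(fig5_density, pitch_density)
```

The first value corresponds to the 2.5×10³ regions/cm² stated for FIG. 5, and the second to the 1,600 regions/cm² stated for a 250 μm spacing.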

Also in a preferred embodiment, the polynucleotides have 
lengths of at least about 50 bp, i.e., substantially longer than 
oligonucleotides which can be formed in high-density arrays
by various in situ synthesis schemes. 

V. Utility 

Microarrays of immobilized nucleic acid sequences
prepared in accordance with the invention can be used for large
scale hybridization assays in numerous genetic applications,
including genetic and physical mapping of genomes,
monitoring of gene expression, DNA sequencing, genetic
diagnosis, genotyping of organisms, and distribution of
DNA reagents to researchers.

For gene mapping, a gene or a cloned DNA fragment is
hybridized to an ordered array of DNA fragments, and the
identity of the DNA elements applied to the array is
unambiguously established by the pixel or pattern of pixels of the
array that are detected. One application of such arrays for
creating a genetic map is described by Nelson, et al. (1993).
In constructing physical maps of the genome, arrays of
immobilized cloned DNA fragments are hybridized with
other cloned DNA fragments to establish whether the cloned
fragments in the probe mixture overlap and are therefore
contiguous to the immobilized clones on the array. For
example, Lehrach, et al., describe such a process.

The arrays of immobilized DNA fragments may also be
used for genetic diagnostics. To illustrate, an array
containing multiple forms of a mutated gene or genes can be probed
with a labeled mixture of a patient's DNA which will
preferentially interact with only one of the immobilized
versions of the gene.

The detection of this interaction can lead to a medical
diagnosis. Arrays of immobilized DNA fragments can also
be used in DNA probe diagnostics. For example, the identity
of a pathogenic microorganism can be established
unambiguously by hybridizing a sample of the unknown
pathogen's DNA to an array containing many types of known
pathogenic DNA. A similar technique can also be used for
unambiguous genotyping of any organism. Other molecules
of genetic interest, such as cDNAs and RNAs, can be
immobilized on the array or alternately used as the labeled
probe mixture that is applied to the array.

In one application, an array of cDNA clones representing
genes is hybridized with total cDNA from an organism to
monitor gene expression for research or diagnostic purposes.
Labeling total cDNA from a normal cell with one color
fluorophore and total cDNA from a diseased cell with
another color fluorophore and simultaneously hybridizing
the two cDNA samples to the same array of cDNA clones
allows for differential gene expression to be measured as the
ratio of the two fluorophore intensities. This two-color
experiment can be used to monitor gene expression in
different tissue types, disease states, response to drugs, or
response to environmental factors. An example of this
approach is illustrated in Example 2, described with respect
to FIG. 8.
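
The ratio measurement described above can be sketched in a few lines of Python. The text specifies only a ratio of the two fluorophore intensities; expressing it as a base-2 logarithm, the `floor` guard against zero intensities, the color assignment, and the intensity values below are all assumptions made for illustration.

```python
import math


def log2_ratio(red, green, floor=1.0):
    """Return log2(red / green) for one array element.

    `floor` is an assumed lower bound that avoids division by zero and
    the logarithm of zero for very dim spots; it is not from the source.
    """
    return math.log2(max(red, floor) / max(green, floor))


# Hypothetical intensities, stored as (green, red), with green assigned
# to the normal-cell cDNA and red to the diseased-cell cDNA.
spots = {"clone_1": (1000.0, 1000.0), "clone_2": (500.0, 4000.0)}
ratios = {name: log2_ratio(red, green) for name, (green, red) in spots.items()}
print(ratios)
```

A value near zero indicates similar expression in the two samples, while a positive value indicates higher signal from the red-labeled sample.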

By way of example and without implying a limitation of
scope, such a procedure could be used to simultaneously
screen many patients against all known mutations in a
disease gene. This invention could be used in the form of, for
example, 96 identical 0.9 cm x 2.2 cm microarrays fabricated
on a single 12 cm x 18 cm sheet of plastic-backed nitrocellulose,
where each microarray could contain, for example,
100 DNA fragments representing all known mutations of a
given gene. The region of interest from each of the DNA
samples from 96 patients could be amplified, labeled, and
hybridized to the 96 individual arrays with each assay
performed in 100 microliters of hybridization solution. The
approximately 1 thick silicone rubber barrier elements
between individual arrays prevent cross-contamination of
the patient samples by sealing the pores of the nitrocellulose
and by acting as a physical barrier between each microarray.
The solid support containing all 96 microarrays assayed with
the 96 patient samples is incubated, rinsed, detected and
analyzed as a single sheet of material using standard
radioactive, fluorescent, or colorimetric detection means
(Maniatis, et al., 1989). Previously, such a procedure would
involve the handling, processing and tracking of 96 separate
membranes in 96 separate sealed chambers. By processing
all 96 arrays as a single sheet of material, significant time
and cost savings are possible.
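
The dimensions in this example can be checked with simple arithmetic. The following Python sketch is illustrative only: the 12 x 8 arrangement of the arrays is an assumption (the text gives only the total of 96), and the function name is hypothetical. It works in whole millimeters to avoid rounding error.

```python
def grid_leftover(sheet_mm, array_mm, counts):
    """Return the unused length (mm) along each axis of the sheet.

    sheet_mm, array_mm: (width, height) in millimeters.
    counts: (columns, rows) of arrays placed on the sheet.
    A negative value means the arrangement does not fit.
    """
    return tuple(s - n * a for s, a, n in zip(sheet_mm, array_mm, counts))


# From the text: 12 cm x 18 cm sheet, 0.9 cm x 2.2 cm arrays.
# Assumed layout: 12 columns by 8 rows, giving 96 arrays.
columns, rows = 12, 8
leftover = grid_leftover((120, 180), (9, 22), (columns, rows))

# Total hybridization solution for 96 assays at 100 microliters each.
total_volume_ul = columns * rows * 100
```

Under this assumed layout the arrays occupy 108 mm of the 120 mm width and 176 mm of the 180 mm length, so the remaining 12 mm and 4 mm would have to accommodate the barrier elements and any margins.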

The assay format can be reversed where the patient or
organism's DNA is immobilized as the array elements and
each array is hybridized with a different mutated allele or
genetic marker. The gridded solid support can also be used
for parallel non-DNA ELISA assays. Furthermore, the
invention allows for the use of all standard detection meth-
ods without the need to remove the shallow barrier elements
to carry out the detection step.

In addition to the genetic applications listed above, arrays
of whole cells, peptides, enzymes, antibodies, antigens,
receptors, ligands, phospholipids, polymers, drug congener
preparations or chemical substances can be fabricated by the
means described in this invention for large scale screening
assays in medical diagnostics, drug discovery, molecular
biology, immunology and toxicology.

The multi-cell substrate aspect of the invention allows for
the rapid and convenient screening of many DNA probes
against many ordered arrays of DNA fragments. This elimi-
nates the need to handle and detect many individual arrays
for performing mass screenings for genetic research and
diagnostic applications. Numerous microarrays can be fab-
ricated on the same solid support and each microarray
reacted with a different DNA probe while the solid support
is processed as a single sheet of material.



The following examples illustrate, but in no way are
intended to limit, the present invention.

EXAMPLE 1 

Genomic-Complexity Hybridization to DNA
Microarrays Representing the Yeast Saccharomyces
cerevisiae Genome with Two-Color Fluorescent
Detection

The array elements were randomly amplified PCR
(Bohlander, et al., 1992) products using physically mapped
lambda clones of S. cerevisiae genomic DNA as templates
(Riles, et al., 1993). The PCR was performed directly on the
lambda phage lysates, resulting in an amplification of both
the 35 kb lambda vector and the 5-15 kb yeast insert
sequences in the form of a uniform distribution of PCR
product between 250-1500 base pairs in length. The PCR
product was purified using Sephadex G50 gel filtration
(Pharmacia, Piscataway, N.J.) and concentrated by evapo-
ration to dryness at room temperature overnight. Each of the
864 amplified lambda clones was rehydrated in 15 μl of
3xSSC in preparation for spotting onto the glass.

The microarrays were fabricated on microscope slides
which were coated with a layer of poly-l-lysine (Sigma). The
automated apparatus described in Section III loaded 1 μl of
the concentrated lambda clone PCR product in 3xSSC
directly from 96 well storage plates into the open capillary
printing element and deposited ~5 nl of sample per slide at
380 micron spacing between spots, on each of 40 slides. The
process was repeated for all 864 samples and 8 control spots.
After the spotting operation was complete, the slides were
rehydrated in a humid chamber for 2 hours, baked in a dry
80° vacuum oven for 2 hours, rinsed to remove unabsorbed
DNA and then treated with succinic anhydride to reduce
non-specific adsorption of the labeled hybridization probe to
the poly-l-lysine coated glass surface. Immediately prior to
use, the immobilized DNA on the array was denatured in
distilled water at 90° for 2 minutes.
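
As a rough check on the spotting figures given above, the following Python sketch computes the spot density implied by the stated spacing and the sample volume consumed per loading. It is illustrative only; it assumes a square grid of spots, and the variable names are hypothetical.

```python
SPACING_UM = 380      # center-to-center spot spacing, from the text
SAMPLES = 864         # amplified lambda clones, from the text
CONTROLS = 8          # control spots, from the text
NL_PER_SLIDE = 5      # approximate deposit per slide, from the text
SLIDES = 40           # slides printed per loading, from the text
LOADED_NL = 1000      # 1 microliter loaded into the printing element

# Spots per square centimeter on an assumed square grid
# (10,000 microns per centimeter).
density_per_cm2 = (10_000 / SPACING_UM) ** 2

# Total number of array elements and volume deposited per loading.
total_spots = SAMPLES + CONTROLS
deposited_nl = NL_PER_SLIDE * SLIDES
fraction_used = deposited_nl / LOADED_NL
```

On these figures the spacing corresponds to roughly 690 spots per square centimeter, and printing 40 slides deposits about 200 nl of the 1 μl loaded.
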
For the pooled chromosome experiment, the 16 chromo-
somes of Saccharomyces cerevisiae were separated in a
CHEF agarose gel apparatus (Biorad, Richmond, Calif.).
The six largest chromosomes were isolated in one gel slice
and the ten smallest chromosomes in a second gel slice. The
DNA was recovered using a gel extraction kit (Qiagen,
Chatsworth, Calif.). The two chromosome pools were ran-
domly amplified in a manner similar to that used for the
target lambda clones. Following amplification, 5 micro-
grams of each of the amplified chromosome pools were
separately random-primer labeled using Klenow polymerase
(Amersham, Arlington Heights, Ill.) with a lissamine con-
jugated nucleotide analog (Dupont NEN, Boston, Mass.) for
the pool containing the six largest chromosomes, and with a
fluorescein conjugated nucleotide analog (BMB) for the
pool containing the ten smallest chromosomes. The two pools
were mixed and concentrated using an ultrafiltration device
(Amicon, Danvers, Mass.).

Five micrograms of the hybridization probe consisting of
both chromosome pools in 7.5 μl of TE was denatured in a
boiling water bath and then snap cooled on ice. 2.5 μl of
concentrated hybridization solution (5xSSC and 0.1% SDS)
was added and all 10 μl transferred to the array surface,
covered with a cover slip, placed in a custom-built single-
slide humidity chamber and incubated at 60° for 12 hours.
The slides were then rinsed at room temperature in 0.1xSSC
and 0.1% SDS for 5 minutes, cover slipped and scanned.

A custom built laser fluorescent scanner was used to
detect the two-color hybridization signals from the 1.8x1.8
cm array at 20 micron resolution. The scanned image was
gridded and analyzed using custom image analysis software.
After correcting for optical crosstalk between the fluoro-
phores due to their overlapping emission spectra, the red and
green hybridization values for each clone on the array were
correlated to the known physical map position of the clone
resulting in a computer-generated color karyotype of the
yeast genome.
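
The specification does not state how the crosstalk correction was computed. One common approach, shown here purely as an illustrative sketch, is linear unmixing: each measured channel is modeled as the true signal plus a fixed fraction of the other fluorophore's signal, and the resulting pair of equations is solved for the true signals. The function name and the crosstalk fractions below are hypothetical.

```python
def unmix_two_channels(measured_red, measured_green,
                       green_into_red, red_into_green):
    """Solve for the true (red, green) signals of one spot.

    Assumed model (not taken from the specification):
        measured_red   = red   + green_into_red * green
        measured_green = green + red_into_green * red
    """
    det = 1.0 - green_into_red * red_into_green
    if det == 0.0:
        raise ValueError("crosstalk fractions make the system singular")
    red = (measured_red - green_into_red * measured_green) / det
    green = (measured_green - red_into_green * measured_red) / det
    return red, green


# Hypothetical spot: true signals 100 (red) and 50 (green), with 10% of
# the green signal appearing in the red channel and 2% of red in green.
red, green = unmix_two_channels(105.0, 52.0, 0.10, 0.02)
```

In practice the crosstalk fractions would be measured from spots labeled with only one fluorophore.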

FIG. 6 shows the hybridization pattern of the two chro-
mosome pools. A red signal indicates that the lambda clone
on the array surface contains a cloned genomic DNA seg-
ment from one of the six largest yeast chromosomes. A green
signal indicates that the lambda clone insert comes from one
of the ten smallest yeast chromosomes. Orange signals
indicate repetitive sequences which cross hybridized to both
chromosome pools. Control spots on the array confirm that
the hybridization is specific and reproducible.
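
The color interpretation described above can be expressed as a simple decision rule. The following Python sketch is illustrative only; the thresholds `min_signal` and `dominance` are hypothetical values chosen for the example and are not taken from the specification.

```python
def classify_spot(red, green, min_signal=100.0, dominance=3.0):
    """Assign a color class to one array element.

    min_signal: hypothetical intensity below which a spot is called dark.
    dominance:  hypothetical factor by which one channel must exceed the
                other for the spot to be called red or green.
    """
    if max(red, green) < min_signal:
        return "dark"      # no usable hybridization signal
    if red >= dominance * green:
        return "red"       # insert from one of the six largest chromosomes
    if green >= dominance * red:
        return "green"     # insert from one of the ten smallest chromosomes
    return "orange"        # both pools hybridize, e.g. repetitive sequence


calls = [classify_spot(r, g)
         for r, g in [(900, 100), (80, 700), (500, 400), (20, 30)]]
```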

The physical map locations of the genomic DNA frag-
ments contained in each of the clones used as array elements
have been previously determined by Olson and co-workers
(Riles, et al.), allowing for the automatic generation of the
color karyotype shown in FIG. 7. The color of a chromo-
somal section on the karyotype corresponds to the color of
the array element containing the clone from that section. The
black regions of the karyotype represent false negative dark
spots on the array (10%) or regions of the genome not
covered by the Olson clone library (90%). Note that the six
largest chromosomes are mainly red while the ten smallest
chromosomes are mainly green, thus matching the original
CHEF gel isolation of the hybridization probe. Areas of the
red chromosomes containing green spots and vice-versa are
probably due to spurious sample tracking errors in the
formation of the original library and in the amplification and
spotting procedures.

The yeast genome arrays have also been probed with
individual clones or pools of clones that are fluorescently
labeled for physical mapping purposes. The hybridization
signals of these clones to the array were translated into
positions on the physical map of the yeast genome.

EXAMPLE 2

Total cDNA Hybridized to Micro Arrays of cDNA 
Clones with Two-Color Fluorescent Detection 






Twenty-four clones containing cDNA inserts from the
plant Arabidopsis were amplified using PCR. Salt was added
to the purified PCR products to a final concentration of
3xSSC. The cDNA clones were spotted on poly-l-lysine
coated microscope slides in a manner similar to Example 1.
Among the cDNA clones was a clone representing a tran-
scription factor HAT4, which had previously been used to
create a transgenic line of the plant Arabidopsis, in which
this gene is present at ten times the level found in wild-type
Arabidopsis (Schena, et al., 1992).

Total poly-A mRNA from wild type Arabidopsis was
isolated using standard methods (Maniatis, et al., 1989) and
reverse transcribed into total cDNA, using a fluorescein
nucleotide analog to label the cDNA product (green
fluorescence). A similar procedure was performed with the
transgenic line of Arabidopsis where the transcription factor
HAT4 was inserted into the genome using standard gene
transfer protocols. cDNA copies of mRNA from the trans-
genic plant were labeled with a lissamine nucleotide analog
(red fluorescence). Two micrograms of the cDNA products
from each type of plant were pooled together and hybridized
to the cDNA clone array in a 10 microliter hybridization
reaction in a manner similar to Example 1. Rinsing and
detection of hybridization was also performed in a manner
similar to Example 1. FIG. 8 shows the resulting hybridiza-
tion pattern of the array.

Genes equally expressed in wild type and the transgenic
Arabidopsis appeared yellow due to equal contributions of
the green and red fluorescence to the final signal. The dots
are different intensities of yellow indicating various levels of
gene expression. The cDNA clone representing the tran-
scription factor HAT4, expressed in the transgenic line of
Arabidopsis but not detectably expressed in wild type
Arabidopsis, appears as a red dot (with the arrow pointing to
it), indicating the preferential expression of the transcription
factor in the red-labeled transgenic Arabidopsis and the
relative lack of expression of the transcription factor in the
green-labeled wild type Arabidopsis.
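
The appearance of yellow and red dots follows from overlaying the two channels in one display image. The Python sketch below is illustrative only and does not describe the software actually used; the function name and the scaling to 8-bit values are assumptions.

```python
def pseudo_color(red, green, max_intensity):
    """Map two channel intensities to an (R, G, B) display triple.

    Each channel is clipped to [0, max_intensity] and scaled to 0-255.
    The blue component is left at zero, so equal red and green
    intensities display as a shade of yellow.
    """
    if max_intensity <= 0:
        raise ValueError("max_intensity must be positive")

    def scale(value):
        clipped = min(max(value, 0.0), max_intensity)
        return round(255 * clipped / max_intensity)

    return (scale(red), scale(green), 0)


# Hypothetical spots: equal expression, and expression in the
# red-labeled sample only.
equal_spot = pseudo_color(1000.0, 1000.0, 1000.0)
red_only_spot = pseudo_color(1000.0, 0.0, 1000.0)
```

Dimmer spots with equal channel intensities map to darker shades of yellow, which corresponds to the varying yellow intensities described above.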

An advantage of the microarray hybridization format for
gene expression studies is the high partial concentration of
each cDNA species achievable in the 10 microliter hybrid-
ization reaction. This high partial concentration allows for
detection of rare transcripts without the need for PCR
amplification of the hybridization probe which may bias the
true genetic representation of each discrete cDNA species.

Gene expression studies such as these can be used for
genomics research to discover which genes are expressed in
which cell types, disease states, development states or
environmental conditions. Gene expression studies can also
be used for diagnosis of disease by empirically correlating
gene expression patterns to disease states.

EXAMPLE 3 

Multiplexed Colorimetric Hybridization on a
Gridded Solid Support

A sheet of plastic-backed nitrocellulose was gridded with
barrier elements made from silicone rubber according to the
description in Section IV-A. The sheet was soaked in
10xSSC and allowed to dry. As shown in FIG. 12, 192 M13
clones, each with a different yeast insert, were arrayed 400
microns apart in four quadrants of the solid support using the
automated device described in Section III. The bottom left
quadrant served as a negative control for hybridization,
while each of the other three quadrants was hybridized
simultaneously with a different oligonucleotide using the
open-face hybridization technology described in Section
IV-A. The first two and last four elements of each array are
positive controls for the colorimetric detection step.

The oligonucleotides were labeled with fluorescein,
which was detected using an anti-fluorescein antibody con-
jugated to alkaline phosphatase that precipitated an NBT/
BCIP dye on the solid support (Amersham). Perfect matches
between the labeled oligos and the M13 clones resulted in
dark spots visible to the naked eye and detected using an
optical scanner (HP ScanJet II) attached to a personal
computer. The hybridization patterns are different in every
quadrant indicating that each oligo found several unique
M13 clones from among the 192 with a perfect sequence
match. Note that the open capillary printing tip leaves
detectable dimples on the nitrocellulose which can be used
to automatically align and analyze the images.
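
The specification does not describe how the dimples are used for alignment. One simple possibility, given here only as an illustrative sketch, is a least-squares fit of a regular grid to the detected dimple positions along each image axis. The function name and the pixel coordinates are hypothetical.

```python
def fit_grid_axis(indices, positions):
    """Least-squares fit of position = origin + pitch * index.

    indices:   grid index of each detected dimple along one axis.
    positions: measured coordinate of each dimple (e.g. in pixels).
    Returns (origin, pitch).
    """
    n = len(indices)
    if n < 2 or n != len(positions):
        raise ValueError("need at least two matched index/position pairs")
    mean_i = sum(indices) / n
    mean_p = sum(positions) / n
    sxx = sum((i - mean_i) ** 2 for i in indices)
    if sxx == 0:
        raise ValueError("indices must not all be equal")
    sxy = sum((i - mean_i) * (p - mean_p)
              for i, p in zip(indices, positions))
    pitch = sxy / sxx
    origin = mean_p - pitch * mean_i
    return origin, pitch


# Hypothetical dimple coordinates for the first four columns of one array.
origin, pitch = fit_grid_axis([0, 1, 2, 3], [10.0, 26.0, 42.0, 58.0])
```

Fitting both axes in this way gives the expected location of every array element, including elements that produced no visible hybridization signal.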

Although the invention has been described with respect to
specific embodiments and methods, it will be clear that
various changes and modifications may be made without
departing from the invention.

We claim:

1. A method of forming a microarray of discrete analyte-
assay regions on a solid support, where each discrete region
in the microarray has a selected, analyte-specific reagent,
said method comprising,

(a) loading an aqueous solution of a selected analyte-
specific reagent in a reagent-dispensing device having
an elongate capillary channel adapted to hold a quantity
of the reagent solution and having a tip region at which
the solution in the channel forms a meniscus,

(b) tapping the tip of the dispensing device against a solid
support at a defined position on the surface, with an
impulse effective to break the meniscus in the capillary
channel and deposit a selected volume between 0.002
and 2 nl of solution on the surface, and

(c) repeating steps (a) and (b) until said microarray is
formed.

2. The method of claim 1, wherein the reagents used to
form the discrete regions in the microarray are distinct
nucleic acid strands and wherein steps (a) and (b) are
repeated until the microarray has about 100 or more discrete
regions of distinct nucleic acid strands per cm² of solid
support.

3. The method of claim 1, wherein the reagents used to
form the discrete regions in the microarray are distinct
nucleic acid strands and wherein steps (a) and (b) are
repeated until the microarray has about 1000 or more
discrete regions of distinct nucleic acid strands per cm² of
solid support.

4. The method of claim 2, wherein the channel is open- 
sided. 

5. The method of claim 3, wherein the channel is open-
sided.

6. The method of claim 4, wherein the volume is between
0.002 and 0.25 nl.

7. The method of claim 5, wherein the volume is between
0.002 and 0.25 nl.



